HeadingSectionExtractor Pennington.Search

Splits post-pipeline page HTML into one HeadingSection per heading, plus a lead section, so the search index can carry heading-level records that deep-link to anchors. The extractor walks the rendered content element in document order: each h2 through h6 element with an id starts a new section, h1 is treated as the page title and is not indexed into any section body, and <pre> subtrees are dropped when code blocks are excluded.
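The walk described above can be sketched as follows. This is an illustration only, not the library's implementation: it uses System.Xml.Linq on well-formed markup as a dependency-free stand-in for IElement, and SketchSection and HeadingSplitSketch are hypothetical names (the real HeadingSection members are not shown on this page). It also assumes that the lead section is always emitted, even when empty, and that a heading without an id contributes its text to the current section.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text;
using System.Xml.Linq;

var content = XElement.Parse(
    "<main><h1>Title</h1><p>Intro.</p>" +
    "<h2 id=\"setup\">Setup</h2><p>Install it.</p><pre>dotnet build</pre></main>");

foreach (var section in HeadingSplitSketch.Split(content, excludeCodeBlocks: true))
{
    var label = section.Anchor ?? "(lead)";
    Console.WriteLine(label + ": " + section.Text);
}

// Hypothetical shape; stands in for the real HeadingSection.
public record SketchSection(string? Anchor, string? Heading, string Text);

public static class HeadingSplitSketch
{
    private static readonly HashSet<string> SectionHeadings =
        new() { "h2", "h3", "h4", "h5", "h6" };

    public static List<SketchSection> Split(XElement content, bool excludeCodeBlocks)
    {
        var sections = new List<SketchSection>();
        string? anchor = null;   // a null anchor marks the lead section
        string? heading = null;
        var body = new StringBuilder();

        // Close the section collected so far and start an empty body.
        void Flush()
        {
            sections.Add(new SketchSection(anchor, heading, body.ToString().Trim()));
            body.Clear();
        }

        // Depth-first traversal, which visits nodes in document order.
        void Walk(XElement element)
        {
            foreach (var node in element.Nodes())
            {
                if (node is XText text) { body.Append(text.Value).Append(' '); continue; }
                if (node is not XElement child) continue;

                var name = child.Name.LocalName.ToLowerInvariant();
                if (name == "h1") continue;                        // page title, never body text
                if (name == "pre" && excludeCodeBlocks) continue;  // skip the whole <pre> subtree

                var id = (string?)child.Attribute("id");
                if (SectionHeadings.Contains(name) && !string.IsNullOrEmpty(id))
                {
                    Flush();
                    anchor = id;
                    heading = child.Value.Trim();
                    continue;
                }
                Walk(child);
            }
        }

        Walk(content);
        Flush();   // emit the final section
        return sections;
    }
}
```

With the sample markup, the sketch yields a lead section containing "Intro." and a "setup" section containing "Install it."; the h1 text and the <pre> contents appear in neither.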

Methods

Extract

public IReadOnlyList<HeadingSection> Extract(IElement content, bool excludeCodeBlocks)

Extracts the lead section plus one section per anchored heading from content.

Parameters

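content IElement: the rendered content element to walk.
excludeCodeBlocks bool: when true, <pre> subtrees are dropped from the extracted sections.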

Returns

IReadOnlyList<HeadingSection>
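A minimal usage sketch under stated assumptions: that IElement here is AngleSharp's AngleSharp.Dom.IElement (this page does not say which library it comes from), and that HeadingSectionExtractor has a public parameterless constructor (none is listed on this page). The HTML string and the "main" selector are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using AngleSharp.Dom;
using AngleSharp.Html.Parser;
using Pennington.Search;

// Illustrative HTML; in practice this would be the post-pipeline page output.
const string html =
    "<main><h1>Guide</h1><p>Intro.</p><h2 id=\"setup\">Setup</h2><p>Steps.</p></main>";

// Assumption: IElement is AngleSharp.Dom.IElement, so AngleSharp's parser can supply it.
var document = new HtmlParser().ParseDocument(html);
IElement content = document.QuerySelector("main") ?? document.DocumentElement;

// Assumption: a public parameterless constructor exists.
var extractor = new HeadingSectionExtractor();
IReadOnlyList<HeadingSection> sections = extractor.Extract(content, excludeCodeBlocks: true);

Console.WriteLine(sections.Count);
```

Passing excludeCodeBlocks: false instead would keep <pre> content in the extracted sections, which may suit sites where code samples should be searchable.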

Pennington.Search.HeadingSectionExtractor

namespace Pennington.Search;

/// Splits post-pipeline page HTML into one HeadingSection per heading (plus a lead section)
/// so the search index can carry heading-level records that deep-link to anchors.
/// Walks the rendered content element in document order; h2 through h6 with an id start a
/// new section, h1 is treated as the page title (not indexed into a section body), and
/// <pre> subtrees are dropped when code blocks are excluded.
public class HeadingSectionExtractor
{
    /// Extracts the lead section plus one section per anchored heading from content.
    public IReadOnlyList<HeadingSection> Extract(IElement content, bool excludeCodeBlocks);
}