Html
This filter is used to extract information from an HTML page received through the Message.
As with jQuery in JavaScript, you provide a selector to match tags in the page, then extract the text, the HTML content, or an attribute value from the matching tags (the filter uses the goquery library under the hood).
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
target | STRING | "main" | the field of the Message that the filter should read (either main or an extra field) |
selector | STRING | "" | the selector to find in the HTML page |
get | STRING | "html" | what to retrieve from the tags matched by the selector: html, text, or attr |
attr | STRING | "" | when get is attr, the name of the attribute to extract |
... | html(selector=".link", get="attr", attr="href") | ...
Output
The filter will generate one or more Messages. This filter can be used more than once.
The field fulltext will contain the original target string.
Examples
Soon…