Web
This feeder creates a stream starting from a web page. It is possible to define how often the page should be downloaded and parsed.
Every time the page is parsed a Message is sent down the lane.
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| url | STRING | empty | URL of the web page |
| freq | DURATION | 60s | how often the page should be parsed |
| text_only | BOOL | “false” | if “true” it removes all the tags from the page |
| method | STRING | “GET” | HTTP method to use on the requests |
| headers | JSON | empty | Headers to use in the request |
| data | JSON | empty | POST fields to send with the requests (cannot be used together with rawData) |
| rawData | STRING | empty | raw body of the requests (cannot be used together with data) |
| status | STRING | empty | if set, the feeder propagates the Message only when the returned HTTP status matches this value |
| cookies | STRING | empty | Path of the JSON file containing the cookies to use |
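
The interaction between `data`, `rawData` and `status` can be illustrated with a short Python sketch. This is not the feeder's actual implementation: `build_request` and `should_propagate` are hypothetical helpers, and encoding `data` as URL-encoded form fields is an assumption based on the "POST fields" wording in the table.

```python
import json
from urllib.parse import urlencode


def build_request(params):
    """Hypothetical helper: turn feeder-style parameters into a request description."""
    data = params.get("data")
    raw = params.get("rawData")
    # The table states that data and rawData cannot be combined.
    if data and raw:
        raise ValueError("data and rawData cannot be used together")

    body = None
    if data:
        # Assumption: JSON fields are sent as a URL-encoded form body.
        body = urlencode(json.loads(data)).encode()
    elif raw:
        body = raw.encode()

    return {
        "url": params["url"],
        "method": params.get("method", "GET"),  # documented default
        "headers": json.loads(params.get("headers") or "{}"),
        "body": body,
    }


def should_propagate(expected_status, actual_status):
    """Hypothetical helper: an empty status parameter lets every response through."""
    return not expected_status or str(actual_status) == expected_status
```

With these helpers, a request built from only a `url` falls back to `GET`, and a response is dropped when `status` is set and does not match the returned code.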
... | <web: url="https://example.com", freq="30m", status="200", cookies="/path/to/exported.json"> | ...
Output
Text
The main field of the Message will contain the HTML source of the page or, if the text_only parameter is set to true, the text of the page with all tags removed.
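
As a rough illustration of what tag removal could look like, here is a minimal Python sketch using only the standard library. It is not the feeder's own code: `strip_tags` is a hypothetical name, and skipping the contents of `script` and `style` elements is an assumption that the documentation does not confirm.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes while ignoring markup."""

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if not self._skip_depth and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def strip_tags(html):
    """Hypothetical helper: return the visible text of an HTML document."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Calling `strip_tags` on a page's HTML source returns the text nodes joined by single spaces, which approximates the main field when text_only is enabled.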
Extra
| Name | Description |
|---|---|
| url | URL of the web page |
| title | meta tag title |
| description | meta tag description |
| image | meta tag image |
| sitename | meta tag sitename |
Not all of the Extra fields are guaranteed to be filled. If the corresponding tag is not present on the page, the field will be empty.
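
The "empty when missing" behaviour can be sketched in Python as follows. This is only an illustration: `extract_extra` is a hypothetical helper, and the mapping from Open Graph properties (`og:title`, `og:description`, `og:image`, `og:site_name`) to the Extra names is an assumption, since the documentation does not say which meta tags are read.

```python
from html.parser import HTMLParser

# Assumed mapping from meta tag keys to the Extra field names in the table.
EXTRA_TAGS = {
    "og:title": "title",
    "og:description": "description",
    "og:image": "image",
    "og:site_name": "sitename",
}


class _MetaExtractor(HTMLParser):
    """Records the content of the first matching <meta> tag for each field."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        key = attributes.get("property") or attributes.get("name")
        content = attributes.get("content")
        if key in EXTRA_TAGS and content:
            self.found.setdefault(EXTRA_TAGS[key], content)


def extract_extra(url, html):
    """Hypothetical helper: build the Extra fields, leaving missing ones empty."""
    parser = _MetaExtractor()
    parser.feed(html)
    parser.close()
    extra = {"url": url, "title": "", "description": "", "image": "", "sitename": ""}
    extra.update(parser.found)
    return extra
```

A page that only declares a title and an image would therefore produce empty strings for `description` and `sitename`.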
Examples
Soon…