Web
This feeder creates a stream starting from a web page. It is possible to define how often the page should be downloaded and parsed.
Every time the page is parsed a Message is sent down the lane.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
url | STRING | empty | URL of the web page |
freq | DURATION | 60s | how often the page should be parsed |
text_only | BOOL | “false” | if “true”, all tags are removed from the page |
method | STRING | “GET” | HTTP method to use on the requests |
headers | JSON | empty | Headers to use in the request |
data | JSON | empty | POST fields to send with the requests (cannot be used in combination with rawData) |
rawData | STRING | empty | raw body of the requests (cannot be used in combination with data) |
status | STRING | empty | the feeder will propagate the Message only if the returned HTTP status matches this value |
cookies | STRING | empty | Path of the JSON file containing the cookies to use |
... | <web: url="https://example.com", freq="30m", status="200", cookies="/path/to/exported.json"> | ...
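The request-related parameters can be pictured with the following minimal Python sketch. It is not the feeder's actual implementation: the names `build_request` and `should_propagate` are hypothetical, form-encoding of `data` is an assumption, and so is the idea that an empty `status` accepts any response.

```python
import urllib.parse
import urllib.request


def build_request(url, method="GET", headers=None, data=None, raw_data=None):
    """Build (but do not send) an HTTP request from feeder-like parameters."""
    # data and rawData are mutually exclusive, as stated in the table above.
    if data is not None and raw_data is not None:
        raise ValueError("data and rawData cannot be used together")

    body = None
    if data is not None:
        # Assumption: POST fields are sent form-encoded.
        body = urllib.parse.urlencode(data).encode("utf-8")
    elif raw_data is not None:
        body = raw_data.encode("utf-8")

    return urllib.request.Request(
        url, data=body, headers=headers or {}, method=method
    )


def should_propagate(returned_status, expected_status=""):
    """Mimic the status parameter.

    Assumption: an empty expected_status lets every response through.
    """
    return expected_status == "" or str(returned_status) == expected_status
```

For instance, `should_propagate(404, "200")` is false, so under these assumptions a page returning 404 would not produce a Message when `status="200"` is set.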
Output
Text
The main field of the Message will contain the HTML source, or the text of the website if the text_only parameter is set to true.
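To illustrate the difference between the two modes, here is a small Python sketch built on the standard `html.parser` module. The names `TextExtractor` and `main_field` are hypothetical, and skipping the contents of `script` and `style` elements is an assumption; the feeder's real tag-stripping logic may differ.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the visible text of a page, dropping every tag."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def main_field(html, text_only=False):
    """Return the raw HTML, or only its text when text_only is true."""
    if not text_only:
        return html
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```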
Extra
Name | Description |
---|---|
url | URL of the web page |
title | meta tag title |
description | meta tag description |
image | meta tag image |
sitename | meta tag sitename |
Not all Extra fields are guaranteed to be filled. If the corresponding tag is not present on the page, the field will be empty.
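The Extra fields could be filled along the lines of the Python sketch below. It assumes the meta tags in question are Open Graph properties (`og:title`, `og:description`, `og:image`, `og:site_name`), which this page does not confirm; `MetaExtractor` and `extract_extra` are hypothetical names. Fields whose tag is missing stay empty.

```python
from html.parser import HTMLParser

# Assumption: Extra fields map to these Open Graph properties.
OG_TO_EXTRA = {
    "og:title": "title",
    "og:description": "description",
    "og:image": "image",
    "og:site_name": "sitename",
}


class MetaExtractor(HTMLParser):
    """Fill the Extra fields from the <meta> tags of a page."""

    def __init__(self, url):
        super().__init__()
        self.extra = {
            "url": url,
            "title": "",
            "description": "",
            "image": "",
            "sitename": "",
        }

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("property") or attributes.get("name")
        field = OG_TO_EXTRA.get(name)
        # Keep the first value found for each field.
        if field and not self.extra[field]:
            self.extra[field] = attributes.get("content") or ""


def extract_extra(url, html):
    """Return a dict with the Extra fields; missing tags give empty strings."""
    parser = MetaExtractor(url)
    parser.feed(html)
    parser.close()
    return parser.extra
```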
Examples
Soon…