delikabot
"delikabot" is the name of delika's web scraping service. At a user's request, delikabot may access your web service (web pages, web APIs, FTP servers, and so on) and transform its content into tabular data that is shared publicly or privately on delika.
User agent
The following is the user agent that delikabot uses:
- user agent token:
delikabot
- full user agent:
delikabot/1.0 (+https://docs.delika.io/delikabot.html)
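If you want to recognize delikabot in your own server logs or request handlers, a minimal sketch is to look for the user agent token in the User-Agent header. The function name `is_delikabot` is hypothetical, and matching on the token alone is an assumption: the header can be set by any client, so treat a match as a hint rather than proof.

```python
import re

# Matches the user agent token "delikabot" as a whole word, ignoring case.
# Assumption: the token documented above is enough to recognize the bot.
_DELIKABOT_TOKEN = re.compile(r"\bdelikabot\b", re.IGNORECASE)


def is_delikabot(user_agent: str) -> bool:
    """Return True if the User-Agent header value contains the delikabot token."""
    return bool(_DELIKABOT_TOKEN.search(user_agent or ""))


if __name__ == "__main__":
    full_ua = "delikabot/1.0 (+https://docs.delika.io/delikabot.html)"
    print(is_delikabot(full_ua))
    print(is_delikabot("Mozilla/5.0 (X11; Linux x86_64)"))
```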
Provide access control for delikabot
You can control what data delikabot collects from your web service in several ways:
Method | Control Type | Details | Notes |
---|---|---|---|
robots exclusion standard | block | delikabot supports the robots exclusion standard (also known as robots.txt). Place a text file named robots.txt at your web server root (i.e. /robots.txt) containing a User-agent directive with delikabot or the wildcard *, followed by Disallow directives for the resources you want to keep delikabot away from. | delikabot doesn't support non-standard directives such as Crawl-delay or Host. |
Cache-Control: no-transform | block | delikabot converts web responses into tabular data. Set the response header Cache-Control: no-transform in your web response to opt out of delikabot. | An HTML meta element such as <meta http-equiv="Cache-Control" content="no-transform"> also works if the response content type is text/html. |
error HTTP status codes | block | delikabot doesn't collect responses with error HTTP status codes (400-599). | delikabot doesn't stop accessing the web resource even after receiving error responses. |
password protection | private use | Put your secret resource behind a login to prevent delikabot from accessing it. | If an individual account provides login credentials for private resources, delikabot may access them on behalf of that user. |
blacklist | block | delikabot maintains its own blacklist of web resources it will not access. Contact us if you are unable to use the technical methods above for some reason. | |
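The first two methods in the table can be tried out locally before you deploy them. The sketch below is illustrative only: the `/private/` path, the `example.com` URLs, and the helper names `delikabot_may_fetch` and `add_no_transform` are assumptions, and Python's standard-library robots.txt parser only approximates how delikabot itself reads the file.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content (assumed path): keeps delikabot out of /private/.
ROBOTS_TXT = """\
User-agent: delikabot
Disallow: /private/
"""


def delikabot_may_fetch(robots_txt: str, url: str) -> bool:
    """Check a URL against robots.txt rules for the delikabot user agent token."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("delikabot", url)


def add_no_transform(headers):
    """Return a copy of (name, value) header pairs with Cache-Control: no-transform.

    An existing Cache-Control header keeps its other directives.
    """
    result = []
    found = False
    for name, value in headers:
        if name.lower() == "cache-control":
            found = True
            directives = [d.strip() for d in value.split(",") if d.strip()]
            if "no-transform" not in [d.lower() for d in directives]:
                directives.append("no-transform")
            value = ", ".join(directives)
        result.append((name, value))
    if not found:
        result.append(("Cache-Control", "no-transform"))
    return result


if __name__ == "__main__":
    print(delikabot_may_fetch(ROBOTS_TXT, "https://example.com/private/report.csv"))
    print(delikabot_may_fetch(ROBOTS_TXT, "https://example.com/public/data.csv"))
    print(add_no_transform([("Cache-Control", "max-age=60")]))
```

Merging into an existing Cache-Control header, rather than overwriting it, keeps any caching policy you already send intact.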