Download Location
[Mirror1 download link] - file hosted by oss.sonatype.org
A powerful web crawler that can also extract and manipulate documents in order to retrieve the required information from the Internet.
What's new in version 2.7.0:
• Now supports NTLM authentication. SPNEGO and Kerberos were also added but are experimental (see DefaultHttpClientFactory).
• Can now specify character set of HTTP connections and authentication forms.
• Can now set custom timeout values on HTTP connection-related activities.
• New option to trust all SSL certificates of sites being crawled (see DefaultHttpClientFactory).
• Can now specify a maximum number of HTTP connections for each crawler independently of configured number of threads (see DefaultHttpClientFactory).
• DefaultHttpClientFactory introduces additional configuration options: proxy scheme, 'Expect: 100-continue' handshake, maximum HTTP redirects, local address, and stale connection checks.
• HTTP header checksum and document checksum are now added to the document metadata as HttpMetadata#CHECKSUM_HEADER and HttpMetadata#CHECKSUM_DOC.
• The empty sub-folders contained under the "download" folder are now periodically deleted. This speeds up directory scanning and increases...
You are about to download a GPL version of Norconex HTTP Collector. These download links are provided to you by the software publisher.