How to use the new Apache Http Client to make a HEAD request

If you have upgraded your Apache HTTP Client code from version 4.2.x to the newest library (at the time of this writing, version 4.3.5 of httpclient and version 4.3.2 of httpcore), you will notice that some classes, like org.apache.http.impl.client.DefaultHttpClient or org.apache.http.params.HttpParams, have been deprecated. Well, I've been there, so in this post I'll show how to get rid of the warnings by using the new classes.
1. Use case from Podcastpedia.org
The use case I will use for demonstration is simple: I have a batch job that checks whether new episodes are available for podcasts. To avoid fetching and parsing the feed when there are no new episodes, I first check whether the ETag or Last-Modified headers of the feed resource have changed since the last call. This works only if the feed publisher supports these headers, which I highly recommend, as it saves bandwidth and processing power for the consumers.
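A minimal sketch of that comparison, in plain Java with no HTTP library involved. The class name FeedChangeDetector and the method hasChanged are hypothetical and not taken from the Podcastpedia code; the rule that a feed without either header is always treated as changed is my assumption, not something the batch job is known to do.

```java
import java.util.Objects;

/** Hypothetical helper: decides whether a feed has to be fetched and parsed again. */
public final class FeedChangeDetector {

    private FeedChangeDetector() {
    }

    /**
     * Compares the header values stored from the last call with the ones just received.
     * A null value means the publisher did not send that header.
     */
    public static boolean hasChanged(String storedEtag, String storedLastModified,
                                     String currentEtag, String currentLastModified) {
        // Assumption: with neither header available there is nothing to compare,
        // so the feed is treated as changed and parsed anyway.
        if (currentEtag == null && currentLastModified == null) {
            return true;
        }
        // Any difference in either header counts as a change.
        return !Objects.equals(storedEtag, currentEtag)
                || !Objects.equals(storedLastModified, currentLastModified);
    }

    public static void main(String[] args) {
        boolean changed = hasChanged("\"abc\"", null, "\"def\"", null);
        System.out.println("Feed changed: " + changed);
    }
}
```

The header values are compared as opaque strings, which is enough to detect a change without interpreting the date or the entity tag.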
So how does it work? Initially, when a new podcast is added to the Podcastpedia.org directory, I check whether the headers are present for the feed resource and, if so, I store them in the database. To do that, I execute an HTTP HEAD request against the URL of the feed with the help of Apache Http Client. According to RFC 2616 (HTTP/1.1), the meta-information contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request.
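To make the HEAD versus GET point concrete, here is a small self-contained sketch. It deliberately uses only JDK classes (com.sun.net.httpserver.HttpServer and HttpURLConnection) instead of Apache Http Client, so it runs without extra dependencies. The class name, the /feed.xml path and the "v1" entity tag are made up for the illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo: the ETag header is the same for HEAD and GET. */
public final class HeadVersusGet {

    /** Sends a request with the given method and returns the ETag response header, or null. */
    static String fetchEtag(String url, String method) throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        connection.setRequestMethod(method);
        connection.setConnectTimeout(10_000);
        connection.setReadTimeout(10_000);
        try {
            connection.getResponseCode(); // triggers the actual request
            return connection.getHeaderField("ETag");
        } finally {
            connection.disconnect();
        }
    }

    /** Starts a throwaway local server and returns the ETags seen via HEAD and via GET. */
    public static String[] demo() {
        byte[] body = "<rss/>".getBytes(StandardCharsets.UTF_8);
        HttpServer server = null;
        try {
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/feed.xml", exchange -> {
                try {
                    exchange.getResponseHeaders().set("ETag", "\"v1\"");
                    if ("HEAD".equals(exchange.getRequestMethod())) {
                        exchange.sendResponseHeaders(200, -1); // headers only, no body
                    } else {
                        exchange.sendResponseHeaders(200, body.length);
                        try (OutputStream out = exchange.getResponseBody()) {
                            out.write(body);
                        }
                    }
                } finally {
                    exchange.close();
                }
            });
            server.start();
            String url = "http://127.0.0.1:" + server.getAddress().getPort() + "/feed.xml";
            return new String[] {fetchEtag(url, "HEAD"), fetchEtag(url, "GET")};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (server != null) {
                server.stop(0);
            }
        }
    }

    public static void main(String[] args) {
        String[] etags = demo();
        System.out.println("HEAD ETag: " + etags[0] + ", GET ETag: " + etags[1]);
    }
}
```

The HEAD call transfers no body at all, which is what makes it a cheap way to look at the headers before deciding whether to download the feed.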
In the following sections I will show what the code actually looks like in Java, before and after the upgrade to the 4.3.x version of the Apache Http Client.
2. Migration to the 4.3.x version
2.1. Software dependencies
To build my project, which by the way is now available on GitHub (Podcastpedia-batch), I am using Maven, so I have listed below the dependencies required for the Apache Http Client:
2.1.1. Before
<!-- Apache Http client -->
<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <version>4.2.5</version>
</dependency>
<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpcore</artifactId>
    <version>4.2.4</version>
</dependency>
2.1.2. After
<!-- Apache Http client -->
<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <version>4.3.5</version>
</dependency>
<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpcore</artifactId>
    <version>4.3.2</version>
</dependency>
2.2. HEAD request with Apache Http Client
2.2.1. Before: v4.2.x
private void setHeaderFieldAttributes(Podcast podcast) throws ClientProtocolException, IOException, DateParseException {

    HttpHead headMethod = null;
    headMethod = new HttpHead(podcast.getUrl());

    org.apache.http.client.HttpClient httpClient = new DefaultHttpClient(poolingClientConnectionManager);

    HttpParams params = httpClient.getParams();
    org.apache.http.params.HttpConnectionParams.setConnectionTimeout(params, 10000);
    org.apache.http.params.HttpConnectionParams.setSoTimeout(params, 10000);
    HttpResponse httpResponse = httpClient.execute(headMethod);
    int statusCode = httpResponse.getStatusLine().getStatusCode();

    if (statusCode != HttpStatus.SC_OK) {
        LOG.error("The introduced URL is not valid " + podcast.getUrl() + " : " + statusCode);
    }

    // set the new etag if existent
    org.apache.http.Header eTagHeader = httpResponse.getLastHeader("etag");
    if (eTagHeader != null) {
        podcast.setEtagHeaderField(eTagHeader.getValue());
    }

    // set the new "last modified" header field if existent
    org.apache.http.Header lastModifiedHeader = httpResponse.getLastHeader("last-modified");
    if (lastModifiedHeader != null) {
        podcast.setLastModifiedHeaderField(DateUtil.parseDate(lastModifiedHeader.getValue()));
        podcast.setLastModifiedHeaderFieldStr(lastModifiedHeader.getValue());
    }

    // Release the connection.
    headMethod.releaseConnection();
}
If you are using a smart IDE, it will tell you that DefaultHttpClient, HttpParams and HttpConnectionParams are deprecated. If you now look at their Javadocs, you will find a suggestion for their replacement, namely to use the HttpClientBuilder and the configuration classes provided by org.apache.http.config and org.apache.http.client.config instead.
So, as you’ll see in the coming section, that’s exactly what I did.
2.2.2. After: v4.3.x
private void setHeaderFieldAttributes(Podcast podcast) throws ClientProtocolException, IOException, DateParseException {

    HttpHead headMethod = null;
    headMethod = new HttpHead(podcast.getUrl());

    RequestConfig requestConfig = RequestConfig.custom()
            .setSocketTimeout(TIMEOUT * 1000)
            .setConnectTimeout(TIMEOUT * 1000)
            .build();

    CloseableHttpClient httpClient = HttpClientBuilder
            .create()
            .setDefaultRequestConfig(requestConfig)
            .setConnectionManager(poolingHttpClientConnectionManager)
            .build();

    HttpResponse httpResponse = httpClient.execute(headMethod);
    int statusCode = httpResponse.getStatusLine().getStatusCode();

    if (statusCode != HttpStatus.SC_OK) {
        LOG.error("The introduced URL is not valid " + podcast.getUrl() + " : " + statusCode);
    }

    // set the new etag if existent
    Header eTagHeader = httpResponse.getLastHeader("etag");
    if (eTagHeader != null) {
        podcast.setEtagHeaderField(eTagHeader.getValue());
    }

    // set the new "last modified" header field if existent
    Header lastModifiedHeader = httpResponse.getLastHeader("last-modified");
    if (lastModifiedHeader != null) {
        podcast.setLastModifiedHeaderField(DateUtil.parseDate(lastModifiedHeader.getValue()));
        podcast.setLastModifiedHeaderFieldStr(lastModifiedHeader.getValue());
    }

    // Release the connection.
    headMethod.releaseConnection();
}
Notice:
- HttpClientBuilder has been used to build a CloseableHttpClient [lines 11-15], which is a base implementation of HttpClient that also implements Closeable
- HttpParams from the previous version have been replaced by org.apache.http.client.config.RequestConfig [lines 6-9], where I can set the socket and connection timeouts. This configuration is later used (line 13) when building the HttpClient
The rest of the code is quite simple:
- the HEAD request is executed (line 17)
- if present, the ETag and Last-Modified headers are persisted
- in the end, the internal state of the request is reset, making it reusable: headMethod.releaseConnection()
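The method above hands the Last-Modified value to DateUtil.parseDate before persisting it. As an illustration of what that step amounts to, here is a JDK-only sketch that turns such a header value into a timestamp with java.time. It is not the Apache utility and not the Podcastpedia code: the class name LastModifiedParser is hypothetical, and it only accepts the RFC 1123 date form, returning null for anything else.

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

/** Hypothetical helper: parses a Last-Modified header value with the JDK only. */
public final class LastModifiedParser {

    private LastModifiedParser() {
    }

    /**
     * Returns the instant described by an RFC 1123 date such as
     * "Wed, 21 Oct 2015 07:28:00 GMT", or null if the value is missing or malformed.
     */
    public static Instant parse(String headerValue) {
        if (headerValue == null) {
            return null;
        }
        try {
            return ZonedDateTime
                    .parse(headerValue.trim(), DateTimeFormatter.RFC_1123_DATE_TIME)
                    .toInstant();
        } catch (DateTimeParseException e) {
            // Assumption: an unparseable date is treated like a missing header.
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("Wed, 21 Oct 2015 07:28:00 GMT"));
    }
}
```

Storing the raw string next to the parsed date, as the method above does, keeps the exact value the server sent available for later comparisons.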
2.2.3. Make the HTTP call from behind a proxy
If you are behind a proxy, you can easily configure the HTTP call by setting an org.apache.http.HttpHost proxy on the RequestConfig:
HttpHost proxy = new HttpHost("xx.xx.xx.xx", 8080, "http");
RequestConfig requestConfig = RequestConfig.custom()
        .setSocketTimeout(TIMEOUT * 1000)
        .setConnectTimeout(TIMEOUT * 1000)
        .setProxy(proxy)
        .build();