Find Broken Images in a Web Page

Broken Images

Hello friends,
While testing a web page, it often happens that the page renders properly but one or more images are not displayed because their path is incorrect.
Images whose path does not point to a loadable resource are called broken images.
Selenium helps us mimic human actions (e.g. clicking, typing, dragging, dropping, etc.).
So how do we use it to test for broken images?

Solution!!!

Selenium WebDriver has no built-in check for this, but there are several ways to do it.
We will use an HttpURLConnection object together with Selenium, following the steps below.

  1. Find all images on the page
  2. Iterate through each image, read its src attribute, and request that URL; any response other than 200 (for example 404) marks the image as broken
  3. Store, notify, or log the broken images in a collection
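
As a rough sketch, steps 2 and 3 can be tried out without a browser or a network. The class name BrokenImageScan, the method findBroken, and the statusLookup hook are all hypothetical names introduced here for illustration; in a real run the hook would issue an HTTP request for each src.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

final class BrokenImageScan {

    // Steps 2 and 3: look up a status for each src and collect the ones that fail.
    // statusLookup is a hypothetical hook standing in for the real HTTP request.
    static List<String> findBroken(List<String> srcs, ToIntFunction<String> statusLookup) {
        List<String> broken = new ArrayList<>();
        for (String src : srcs) {
            // A missing or empty src can never load, so no lookup is needed.
            if (src == null || src.isEmpty()) {
                broken.add("(missing src)");
                continue;
            }
            // Anything other than 200 is treated as broken, as in step 2.
            if (statusLookup.applyAsInt(src) != 200) {
                broken.add(src);
            }
        }
        return broken;
    }
}
```

Keeping the lookup pluggable means the collection logic can be checked with a stub before any real requests are made.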

HttpURLConnection Example

Please look at the code snippet below:
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

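public class ImageUtills {

 // Returns the <img> elements on the page whose src does not answer with HTTP 200.
 // The browser is left open so that the returned elements stay usable by the caller.
 public static List<WebElement> getBrokenLinks(String webUrl) {
  WebDriver driver = new FirefoxDriver();
  driver.get(webUrl);
  List<WebElement> images = driver.findElements(By.xpath("//img"));
  List<WebElement> brokenImages = new ArrayList<WebElement>();

  // Use a proxy only if your network sits behind a proxy server;
  // replace the placeholder host and port with your own.
  Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress("Your Proxy Server", 80));
  for (WebElement image : images) {
   String url = image.getAttribute("src");
   // An <img> with a missing or empty src cannot load, so count it as broken.
   if (url == null || url.isEmpty()) {
    brokenImages.add(image);
    continue;
   }
   try {
    // If there is no proxy, call openConnection() with no arguments instead.
    HttpURLConnection http = (HttpURLConnection) new URL(url).openConnection(proxy);
    if (http.getResponseCode() != HttpURLConnection.HTTP_OK) {
     brokenImages.add(image);
    }
    http.disconnect();
   } catch (Exception e) {
    // A malformed URL or a failed connection also means the image cannot load.
    brokenImages.add(image);
    e.printStackTrace();
   }
  }

  // You can use the brokenImages list to display, notify, or assert.
  return brokenImages;
 }

}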
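
If you only need the status code, a lighter check may be enough. The sketch below is one possible variant, not part of the snippet above: ImageStatusCheck, fetchStatus, and isBroken are hypothetical names, and it assumes the server answers HEAD requests (some do not, in which case a GET would be needed). It also sets timeouts so that one slow image cannot hang the whole test.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

final class ImageStatusCheck {

    // Returns the HTTP status for the URL, or -1 if no HTTP response was obtained.
    static int fetchStatus(String url, int timeoutMillis) {
        try {
            URLConnection raw = new URL(url).openConnection();
            if (!(raw instanceof HttpURLConnection)) {
                return -1; // e.g. a file: URL, which has no status code
            }
            HttpURLConnection http = (HttpURLConnection) raw;
            http.setRequestMethod("HEAD");          // ask for headers only, not the image body
            http.setConnectTimeout(timeoutMillis);
            http.setReadTimeout(timeoutMillis);
            try {
                return http.getResponseCode();
            } finally {
                http.disconnect();
            }
        } catch (IOException e) {
            return -1; // malformed URL, unknown host, timeout, and so on
        }
    }

    // Assumption: only 2xx counts as loaded; HttpURLConnection follows
    // same-protocol redirects by default before reporting the final code.
    static boolean isBroken(int status) {
        return status < 200 || status >= 300;
    }
}
```

A call such as isBroken(fetchStatus(src, 5000)) could then replace the inline status check inside the loop.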

Below are the different response codes and their meanings; they may help you somewhere.

1xx: Information

Message: Description
100 Continue: The server has received the request headers, and the client should proceed to send the request body
101 Switching Protocols: The requester has asked the server to switch protocols
103 Early Hints: Used to return some response headers before the final response

2xx: Successful

Message: Description
200 OK: The request is OK (this is the standard response for successful HTTP requests)
201 Created: The request has been fulfilled, and a new resource is created
202 Accepted: The request has been accepted for processing, but the processing has not been completed
203 Non-Authoritative Information: The request has been successfully processed, but is returning information that may be from another source
204 No Content: The request has been successfully processed, but is not returning any content
205 Reset Content: The request has been successfully processed, but is not returning any content, and requires that the requester reset the document view
206 Partial Content: The server is delivering only part of the resource due to a range header sent by the client

3xx: Redirection

Message: Description
300 Multiple Choices: The request has more than one possible response, and the user or user agent can choose one of them
301 Moved Permanently: The requested page has moved to a new URL
302 Found: The requested page has moved temporarily to a new URL
303 See Other: The requested page can be found under a different URL
304 Not Modified: Indicates the requested page has not been modified since last requested
306 Switch Proxy: No longer used
307 Temporary Redirect: The requested page has moved temporarily to a new URL
308 Permanent Redirect: The requested page has moved permanently to a new URL, and the request method must not change

4xx: Client Error

Message: Description
400 Bad Request: The request cannot be fulfilled due to bad syntax
401 Unauthorized: Authentication is required and has failed or has not yet been provided
402 Payment Required: Reserved for future use
403 Forbidden: The request was valid, but the server is refusing to respond to it
404 Not Found: The requested page could not be found but may be available again in the future
405 Method Not Allowed: A request was made of a page using a request method not supported by that page
406 Not Acceptable: The server can only generate a response that is not accepted by the client
407 Proxy Authentication Required: The client must first authenticate itself with the proxy
408 Request Timeout: The server timed out waiting for the request
409 Conflict: The request could not be completed because of a conflict in the request
410 Gone: The requested page is no longer available
411 Length Required: The "Content-Length" is not defined. The server will not accept the request without it
412 Precondition Failed: The precondition given in the request evaluated to false by the server
413 Request Entity Too Large: The server will not accept the request, because the request entity is too large
414 Request-URI Too Long: The server will not accept the request, because the URL is too long. This can occur when a POST request is converted to a GET request with long query information
415 Unsupported Media Type: The server will not accept the request, because the media type is not supported
416 Requested Range Not Satisfiable: The client has asked for a portion of the file, but the server cannot supply that portion
417 Expectation Failed: The server cannot meet the requirements of the Expect request-header field

5xx: Server Error

Message: Description
500 Internal Server Error: A generic error message, given when no more specific message is suitable
501 Not Implemented: The server either does not recognize the request method, or it lacks the ability to fulfill the request
502 Bad Gateway: The server was acting as a gateway or proxy and received an invalid response from the upstream server
503 Service Unavailable: The server is currently unavailable (overloaded or down)
504 Gateway Timeout: The server was acting as a gateway or proxy and did not receive a timely response from the upstream server
505 HTTP Version Not Supported: The server does not support the HTTP protocol version used in the request
511 Network Authentication Required: The client needs to authenticate to gain network access
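
When logging broken images, it can help to report which family a code belongs to rather than the bare number. The helper below is a small sketch under that assumption; StatusClass and its method of are hypothetical names, and the labels simply reuse the five headings from the lists above.

```java
final class StatusClass {

    // Maps a status code to the heading it falls under in the lists above.
    static String of(int code) {
        switch (code / 100) {
            case 1: return "Information";
            case 2: return "Successful";
            case 3: return "Redirection";
            case 4: return "Client Error";
            case 5: return "Server Error";
            default: return "Unknown"; // e.g. -1 when no response was received
        }
    }
}
```

For example, a 404 on an image would be logged as a Client Error and a 503 as a Server Error, which hints at whether the page or the image host is at fault.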

Hope you learned something from this post...


Keep Automating........