haralyzer package

Submodules

haralyzer.assets module

Provides all the main functional classes for analyzing HAR files

class haralyzer.assets.HarEntry(entry: dict)[source]

Bases: MimicDict

An object that represents one entry in a HAR Page

property cache: str
Returns:

Cached objects

Return type:

str

property cookies: list
Returns:

Request and Response Cookies

Return type:

list

property pageref: str
Returns:

Page for the entry

Return type:

str

property port: int
Returns:

Port connection was made to

Return type:

int

property request: Request
Returns:

Request of the entry

Return type:

Request

property response: Response
Returns:

Response of the entry

Return type:

Response

property secure: bool
Returns:

Connection was secure

Return type:

bool

property serverAddress: str
Returns:

IP Address of the server

Return type:

str

property startTime: datetime | None

Start time and date

Returns:

Start time of entry

Return type:

Optional[datetime.datetime]
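As a rough illustration of why the return type is optional, the sketch below parses the ISO 8601 startedDateTime field of a raw HAR entry with the standard library only. The helper name parse_start_time and the sample entry are hypothetical, and haralyzer itself may parse the value differently.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_start_time(entry: dict) -> Optional[datetime]:
    """Parse the entry's ISO 8601 startedDateTime, or return None."""
    raw = entry.get("startedDateTime")
    if not raw:
        return None
    try:
        # fromisoformat() only accepts a trailing "Z" on Python 3.11+,
        # so normalise it to an explicit UTC offset first.
        return datetime.fromisoformat(raw.replace("Z", "+00:00"))
    except ValueError:
        return None


# Hypothetical raw entry containing only the field used here.
entry = {"startedDateTime": "2024-01-01T12:00:00.250Z"}
parsed = parse_start_time(entry)
expected = datetime(2024, 1, 1, 12, 0, 0, 250000, tzinfo=timezone.utc)
print(parsed == expected)
```

A missing or malformed timestamp yields None rather than an exception, which is the behaviour the Optional return type suggests.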

property status: int
Returns:

HTTP Status Code

Return type:

int

property time: int
Returns:

Time taken to complete entry

Return type:

int

property timings: dict
Returns:

Timing of the page load

Return type:

dict

property url: str
Returns:

URL of Entry

Return type:

str

class haralyzer.assets.HarPage(page_id: str, har_parser: HarParser = None, har_data: dict = None)[source]

Bases: object

An object representing one page of a HAR resource

property actual_page: HarEntry

Returns the first entry object that does not have a redirect status, indicating that it is the actual page we care about (after redirects).

Returns:

First entry of the page

Return type:

HarEntry

property audio_files: List[HarEntry]

All audio files for a page

Returns:

Audio entries for a page

Return type:

List[HarEntry]

property audio_load_time: int

Audio load time

Returns:

Load time for audio on a page

Return type:

int

property audio_size: int

Size of audio files from the page

Returns:

Size of audio files on the page

Return type:

int

property audio_size_trans: int

Audio transfer size

Returns:

Size of transfer data for audio

Return type:

int

property content_load_time: int

Content load time

Returns:

Load time for all content

Return type:

int

property css_files: List[HarEntry]

All CSS files for a page

Returns:

CSS entries for a page

Return type:

List[HarEntry]

property css_load_time: int

CSS load time

Returns:

Load time for CSS on a page

Return type:

int

property css_size: int

Size of CSS files from the page

Returns:

Size of CSS files on the page

Return type:

int

property css_size_trans: int

CSS transfer size

Returns:

Size of transfer data for CSS

Return type:

int

property duplicate_url_request: dict

Returns a dict mapping each URL that was requested more than once to the number of times it was requested

Returns:

URLs and the number of times each was requested

Return type:

dict
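The idea behind this property can be sketched on raw HAR entry dicts with collections.Counter. This is a minimal illustration, not the library's code; the function name duplicate_urls and the sample entries are hypothetical.

```python
from collections import Counter


def duplicate_urls(entries):
    """Map each URL requested more than once to its request count."""
    counts = Counter(entry["request"]["url"] for entry in entries)
    return {url: n for url, n in counts.items() if n > 1}


# Hypothetical raw HAR entries (only the fields used here).
entries = [
    {"request": {"url": "https://example.com/app.js"}},
    {"request": {"url": "https://example.com/logo.png"}},
    {"request": {"url": "https://example.com/app.js"}},
]
print(duplicate_urls(entries))
```

URLs requested exactly once are left out, so an empty dict means the page made no repeated requests.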

property entries: List[HarEntry]
Returns:

All entries that make up the page

Return type:

List[HarEntry]

filter_entries(request_type: str = None, content_type: str = None, status_code: str = None, http_version: str = None, load_time__gt: int = None, regex: bool = True) List[HarEntry][source]

Generate a list of entries that match the given criteria

Parameters:
  • request_type (str) – The request type (i.e. - GET or POST)

  • content_type (str) – Regex to use for finding content type

  • status_code (str) – The desired status code

  • http_version (str) – HTTP version of request

  • load_time__gt (int) – Load time in milliseconds. If provided, an entry whose load time is less than this value will be excluded from the results.

  • regex (bool) – Whether to use regex or exact match.

Returns:

List of entry objects based on the filtered criteria.

Return type:

List[HarEntry]
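The way the criteria combine can be sketched over raw HAR entry dicts as follows. This is an illustration under assumptions: the names matches and filter_raw_entries are hypothetical, regex matching is assumed to mean re.search against the field, and http_version is omitted for brevity.

```python
import re


def matches(pattern, value, regex=True):
    """Regex search when regex=True, otherwise exact string equality."""
    if regex:
        return re.search(pattern, value) is not None
    return pattern == value


def filter_raw_entries(entries, request_type=None, content_type=None,
                       status_code=None, load_time__gt=None, regex=True):
    """Keep entries that satisfy every criterion that was supplied."""
    results = []
    for entry in entries:
        response = entry["response"]
        if request_type is not None and not matches(
                request_type, entry["request"]["method"], regex):
            continue
        if content_type is not None and not matches(
                content_type, response["content"]["mimeType"], regex):
            continue
        # The status is compared as a string, so "2.." can match 200.
        if status_code is not None and not matches(
                status_code, str(response["status"]), regex):
            continue
        # Entries faster than the threshold are excluded.
        if load_time__gt is not None and entry["time"] < load_time__gt:
            continue
        results.append(entry)
    return results


# Hypothetical raw HAR entries (only the fields used here).
entries = [
    {"request": {"method": "GET"}, "time": 120,
     "response": {"status": 200, "content": {"mimeType": "image/png"}}},
    {"request": {"method": "POST"}, "time": 40,
     "response": {"status": 302, "content": {"mimeType": "text/html"}}},
    {"request": {"method": "GET"}, "time": 15,
     "response": {"status": 200, "content": {"mimeType": "image/jpeg"}}},
]
slow_images = filter_raw_entries(entries, content_type="image", load_time__gt=100)
print(len(slow_images))
```

Criteria left as None are ignored, so calling the function with no arguments other than the entries returns every entry.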

get_load_time(request_type: str = None, content_type: str = None, status_code: str = None, asynchronous: bool = True, **kwargs) int[source]

This method can return the TOTAL load time for the assets or the ACTUAL load time, the difference being that the actual load time takes asynchronous transactions into account. So, if you want the total load time, set asynchronous=False.

EXAMPLE:

I want to know the load time for images on a page that has two images, each of which took 2 seconds to download, but the browser downloaded them at the same time.

self.get_load_time(content_type='image') (returns 2)
self.get_load_time(content_type='image', asynchronous=False) (returns 4)

Parameters:
  • request_type (str) – The request type (i.e. - GET or POST)

  • content_type (str) – Regex to use for finding content type

  • status_code (str) – The desired status code

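  • asynchronous (bool) – Whether to take parallel (asynchronous) downloads into account, giving the actual load time. Set to False for the total load time.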

Returns:

Total load time

Return type:

int
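The difference between the total and the actual load time can be sketched by merging overlapping download intervals. This is only an illustration of the concept, not haralyzer's implementation: the function names are hypothetical, the sample dicts use a simplified "start" key holding a datetime instead of the HAR startedDateTime string, and times are in milliseconds.

```python
from datetime import datetime, timedelta


def total_load_time(entries):
    """Sum of every entry's duration in ms (overlap is counted repeatedly)."""
    return sum(entry["time"] for entry in entries)


def actual_load_time(entries):
    """Wall-clock ms during which at least one entry was loading."""
    intervals = sorted(
        (e["start"], e["start"] + timedelta(milliseconds=e["time"]))
        for e in entries
    )
    covered = timedelta(0)
    current_start = current_end = None
    for start, end in intervals:
        if current_end is None or start > current_end:
            # Gap before this interval: close the previous merged run.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlap: extend the current merged run.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return round(covered.total_seconds() * 1000)


# Two images, 2 seconds each, downloaded at the same time.
start = datetime(2024, 1, 1, 12, 0, 0)
images = [
    {"start": start, "time": 2000},
    {"start": start, "time": 2000},
]
print(actual_load_time(images), total_load_time(images))
```

For the two-image scenario described above, the merged intervals cover 2000 ms while the plain sum is 4000 ms.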

property get_requests: List[HarEntry]

Returns a list of GET requests, each of which is a HarEntry object

Returns:

All GET requests

Return type:

List[HarEntry]

static get_total_size(entries: List[HarEntry]) int[source]

Returns the total size of a collection of entries.

Parameters:

entries (List[HarEntry]) – List of entries to calculate the total size of.

Returns:

Total size of entries

Return type:

int
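A minimal sketch of totalling sizes over raw HAR entries, assuming the size is taken from each response's bodySize field (an assumption; the field haralyzer actually reads is not stated here). In HAR 1.2 a bodySize of -1 means the size is not available, so such entries are skipped. The function name total_body_size is hypothetical.

```python
def total_body_size(entries):
    """Sum response body sizes in bytes, skipping unknown (-1) sizes."""
    return sum(
        entry["response"]["bodySize"]
        for entry in entries
        if entry["response"]["bodySize"] > 0
    )


# Hypothetical raw HAR entries (only the field used here).
entries = [
    {"response": {"bodySize": 1500}},
    {"response": {"bodySize": -1}},  # size not available
    {"response": {"bodySize": 500}},
]
print(total_body_size(entries))
```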

static get_total_size_trans(entries: List[HarEntry]) int[source]

Returns the total size of a collection of entries - transferred.

NOTE: Use with HAR files generated with chrome-har-capturer

Parameters:

entries (List[HarEntry]) – List of entries to calculate the total size of.

Returns:

Total size of entries that was transferred

Return type:

int

property hostname: str
Returns:

Hostname of the initial request

Return type:

str

property html_files: List[HarEntry]

All HTML files for a page

Returns:

HTML entries for a page

Return type:

List[HarEntry]

property html_load_time: int

HTML load time

Returns:

Load time for HTML on a page

Return type:

int

property image_files: List[HarEntry]

All image files for a page

Returns:

Image entries for a page

Return type:

List[HarEntry]

property image_load_time: int

Image load time

Returns:

Load time for images on a page

Return type:

int

property image_size: int

Size of image files from the page

Returns:

Size of image files on the page

Return type:

int

property image_size_trans: int

Image transfer size

Returns:

Size of transfer data for images

Return type:

int

property initial_load_time: int

Initial load time

Returns:

Initial load time of the page

Return type:

int

property js_files: List[HarEntry]

All JS files for a page

Returns:

JS entries for a page

Return type:

List[HarEntry]

property js_load_time: int

JS load time

Returns:

Load time for JS on a page

Return type:

int

property js_size: int

Size of JS files from the page

Returns:

Size of JS files on the page

Return type:

int

property js_size_trans: int

JS transfer size

Returns:

Size of transfer data for JS

Return type:

int

property page_load_time: int

Load time of the page

Returns:

Load time for the page

Return type:

int

property page_size: int

Size of the page

Returns:

Size of the page

Return type:

int

property page_size_trans: int

Page transfer size

Returns:

Size of transfer data for the page

Return type:

int

property post_requests: List[HarEntry]

Returns a list of POST requests, each of which is a HarEntry object

Returns:

All POST requests

Return type:

List[HarEntry]

property text_files: List[HarEntry]

All text files for a page

Returns:

Text entries for a page

Return type:

List[HarEntry]

property text_size: int

Size of text files from the page

Returns:

Size of text files on the page

Return type:

int

property text_size_trans: int

Text transfer size

Returns:

Size of transfer data for text

Return type:

int

property time_to_first_byte: int | None
Returns:

Time to first byte of the page request in ms

Return type:

Optional[int]

property url: str | None

The absolute URL of the initial request.

Returns:

URL of first request

Return type:

Optional[str]

property video_files: List[HarEntry]

All video files for a page

Returns:

Video entries for a page

Return type:

List[HarEntry]

property video_load_time: int

Video load time

Returns:

Load time for video on a page

Return type:

int

property video_size: int

Size of video files from the page

Returns:

Size of video files on the page

Return type:

int

property video_size_trans: int

Video transfer size

Returns:

Size of transfer data for video

Return type:

int

class haralyzer.assets.HarParser(har_data: dict = None)[source]

Bases: object

A basic HAR parser that also adds helpers for analyzing the performance of a web page.

property browser: str

Browser of Har File

Returns:

Browser of the Har File

Return type:

str

static create_asset_timeline(asset_list: List[HarEntry]) dict[source]

Returns a dict of the timeline for the requested assets. The key is a datetime object (down to the millisecond) of ANY time where at least one of the requested assets was loaded. The value is a list of ALL assets that were loading at that time.

Parameters:

asset_list (List[HarEntry]) – The assets to create a timeline for.

Returns:

Milliseconds and assets that were loaded

Return type:

dict
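The shape of such a timeline can be sketched as below. This is an illustration only: build_timeline is a hypothetical name, and the sample assets use simplified "url", "start" (a datetime) and "time" (milliseconds) keys rather than HarEntry objects.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def build_timeline(assets):
    """Map each millisecond timestamp to the assets loading at that instant."""
    timeline = defaultdict(list)
    for asset in assets:
        for offset in range(asset["time"]):
            moment = asset["start"] + timedelta(milliseconds=offset)
            timeline[moment].append(asset["url"])
    return dict(timeline)


start = datetime(2024, 1, 1, 12, 0, 0)
assets = [
    {"url": "a.png", "start": start, "time": 3},
    {"url": "b.png", "start": start + timedelta(milliseconds=2), "time": 2},
]
timeline = build_timeline(assets)
print(len(timeline))
```

Because every key is a millisecond in which at least one asset was loading, the number of keys equals the wall-clock load time in milliseconds with overlapping downloads counted once.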

property creator: str

Creator of Har File. Usually the same as the browser but not always

Returns:

Program that created the HarFile

Return type:

str

static from_file(file: [str, bytes]) HarParser[source]

Creates a HarParser from a file path

Parameters:

file ([str, bytes]) – Path to har file or bytes of har file

Returns:

HarParser Object

Return type:

HarParser

static from_string(data: [str, bytes])[source]

Function to load string or bytes as a HarParser

Parameters:

data ([str, bytes]) – Input string or bytes

Returns:

HarParser Object

Return type:

HarParser
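For orientation, a HAR document is JSON with a single top-level log object that holds fields such as version, creator, pages and entries. The sketch below decodes a string or bytes payload with the standard library and checks for that key. The helper load_har and the sample payload are hypothetical, and the validation haralyzer performs may differ.

```python
import json


def load_har(data):
    """Decode HAR text (str or bytes) and return its top-level `log` object."""
    har = json.loads(data)
    if not isinstance(har, dict) or "log" not in har:
        raise ValueError("not a HAR document: missing top-level 'log' key")
    return har["log"]


# Hypothetical minimal payload; real HAR files carry many more fields.
raw = b'{"log": {"version": "1.2", "creator": {"name": "demo", "version": "0.1"}, "entries": []}}'
log = load_har(raw)
print(log["version"], log["creator"]["name"])
```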

property hostname: str

Hostname of first page

Returns:

Hostname of the first known page

Return type:

str

static match_content_type(entry: HarEntry, content_type: str, regex: bool = True) bool[source]

Matches the content type of a request using the mimeType metadata.

Parameters:
  • entry (HarEntry) – Entry to analyze

  • content_type (str) – Regex to use for finding content type

  • regex (bool) – Whether to use regex or exact match.

Returns:

Mime type matches

Return type:

bool

static match_headers(entry: HarEntry, header_type: str, header: str, value: str, regex: bool = True) bool[source]

Function to match headers.

Since the output of headers might use different case, like:

‘content-type’ vs ‘Content-Type’

This function is case-insensitive

Parameters:
  • entry (HarEntry) – Entry to analyze

  • header_type (str) – Header type. Valid values: ‘request’, or ‘response’

  • header (str) – The header to search for

  • value (str) – The value to search for

  • regex (bool) – Whether to use regex or exact match

Returns:

Whether a match was found

Return type:

bool
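A minimal sketch of the matching idea on a raw HAR headers list (a list of dicts with name and value keys). The function name header_matches is hypothetical. Header names are compared case-insensitively as described above; treating the value comparison as case-sensitive and using re.search for the regex case are assumptions of this sketch.

```python
import re


def header_matches(headers, header, value, regex=True):
    """Return True if a header with this name (any case) has a matching value."""
    for item in headers:
        if item["name"].lower() != header.lower():
            continue
        if regex:
            if re.search(value, item["value"]):
                return True
        elif item["value"] == value:
            return True
    return False


# Hypothetical response headers as they might appear in a HAR entry.
response_headers = [
    {"name": "content-type", "value": "text/html; charset=utf-8"},
    {"name": "Server", "value": "nginx"},
]
print(header_matches(response_headers, "Content-Type", "text/html"))
print(header_matches(response_headers, "Content-Type", "text/html", regex=False))
```

With regex disabled the whole value has to be equal, so "text/html" no longer matches a header that also carries a charset parameter.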

static match_http_version(entry: HarEntry, http_version: str, regex: bool = True) bool[source]

Helper function that checks whether the HTTP version of an entry matches the given http_version argument.

Parameters:
  • entry (HarEntry) – Entry to analyze

  • http_version (str) – HTTP version type to match

  • regex (bool) – Whether to use a regex or string match

Returns:

HTTP version matches

Return type:

bool

static match_request_type(entry: HarEntry, request_type: str, regex: bool = True) bool[source]

Helper function that checks whether the request type of an entry matches the given request_type argument.

Parameters:
  • entry (HarEntry) – Entry to analyze

  • request_type (str) – Request type to match

  • regex (bool) – Whether to use a regex or string match

Returns:

Request method matches

Return type:

bool

static match_status_code(entry: HarEntry, status_code: str, regex: bool = True) bool[source]

Helper function that checks whether the status code of an entry matches the given status_code argument.

NOTE: This performs a STRING comparison, NOT a numerical one

Parameters:
  • entry (HarEntry) – Entry to analyze

  • status_code (str) – Status code to search for

  • regex (bool) – Whether to use a regex or string match

Returns:

Status code matches

Return type:

bool
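The consequence of comparing status codes as strings can be sketched as follows. The function status_matches is hypothetical, and the use of an unanchored re.search for the regex case is an assumption rather than a statement about the library.

```python
import re


def status_matches(status, status_code, regex=True):
    """Compare a numeric HTTP status against a string pattern or literal."""
    status_text = str(status)
    if regex:
        return re.search(status_code, status_text) is not None
    return status_text == status_code


print(status_matches(204, "2.."))                # any three-character run starting with 2
print(status_matches(420, "20"))                 # unanchored pattern also hits 420
print(status_matches(420, "^20"))                # anchoring avoids that
print(status_matches(200, "200", regex=False))   # exact string equality
```

Under these assumptions, a pattern such as "2.." or "^20" is the safer way to select a family of status codes, since a bare "20" can match anywhere in the string.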

property pages: List[HarPage]

This is a list of HarPage objects, each of which represents a page from the HAR file.

Returns:

HarPages in the file

Return type:

List[HarPage]

property version: str

HAR Version

Returns:

Version of HAR used

Return type:

str

haralyzer.assets.convert_to_entry(func)[source]

Wrapper function for converting dicts of entries to HarEntry objects

haralyzer.errors module

Custom exceptions for good ol haralyzer.

exception haralyzer.errors.PageNotFoundError[source]

Bases: AttributeError

Error raised if the page is not found

haralyzer.http module

Creates the Request and Response subclasses that are used by each entry

class haralyzer.http.Request(entry: dict)[source]

Bases: HttpTransaction

Request object for a HarEntry

property accept: str
Returns:

HTTP Accept header

Return type:

str

property bodySize: int
Returns:

Body size of the request

Return type:

int

property cacheControl: str
Returns:

HTTP CacheControl header

Return type:

str

property cookies: list
Returns:

Cookies from the request

Return type:

list

property encoding: str
Returns:

HTTP Accept-Encoding Header

Return type:

str

property headersSize: int
Returns:

Headers size from the request

Return type:

int

property host: str
Returns:

HTTP Host header

Return type:

str

property httpVersion: str
Returns:

HTTP version used in the request

Return type:

str

property language: str
Returns:

HTTP language header

Return type:

str

property method: str
Returns:

HTTP method of the request

Return type:

str

property mimeType: str | None
Returns:

Mime Type of request

Return type:

Optional[str]

property queryString: list
Returns:

Query string from the request

Return type:

list

property text: str | None
Returns:

Request body

Return type:

Optional[str]

property url: str
Returns:

URL of the request

Return type:

str

property userAgent: str
Returns:

User Agent

Return type:

str

class haralyzer.http.Response(url: str, entry: dict)[source]

Bases: HttpTransaction

Response object for a HarEntry

property bodySize: int
Returns:

Body Size

Return type:

int

property cacheControl: str
Returns:

Cache Control Header

Return type:

str

property contentSecurityPolicy: str
Returns:

Content Security Policy Header

Return type:

str

property contentSize: int
Returns:

Content Size

Return type:

int

property contentType: str
Returns:

Content Type

Return type:

str

property date: str
Returns:

Date of response

Return type:

str

property headersSize: int
Returns:

Header size

Return type:

int

property httpVersion: str
Returns:

HTTP Version

Return type:

str

property lastModified: str
Returns:

Last modified time

Return type:

str

property mimeType: str
Returns:

Mime Type of response

Return type:

str

property redirectURL: str | None
Returns:

Redirect URL

Return type:

Optional[str]

property status: int
Returns:

HTTP Status

Return type:

int

property statusText: str
Returns:

HTTP Status Text

Return type:

str

property text: str
Returns:

Response body

Return type:

str

property textEncoding: str
Returns:

How the response body is encoded

Return type:

str

haralyzer.mixins module

Mixin Objects that allow for shared methods

class haralyzer.mixins.GetHeaders[source]

Bases: object

Mixin to get a header

get_header_value(name: str) str | None[source]

Returns the header value of the header defined in name

Parameters:

name (str) – Name of the header to get the value of

Returns:

Value of the header

Return type:

Optional[str]
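A header lookup of this kind can be sketched over the raw HAR headers list, where each header is a dict with name and value keys. The function header_value is hypothetical. It compares names case-insensitively because HTTP header names are case-insensitive; whether the mixin does the same is an assumption.

```python
from typing import Optional


def header_value(headers: list, name: str) -> Optional[str]:
    """Return the value of the first header called `name`, or None."""
    wanted = name.lower()
    for item in headers:
        if item["name"].lower() == wanted:
            return item["value"]
    return None


# Hypothetical request headers as they might appear in a HAR entry.
headers = [
    {"name": "Host", "value": "example.com"},
    {"name": "User-Agent", "value": "Mozilla/5.0"},
]
print(header_value(headers, "user-agent"))
print(header_value(headers, "X-Missing"))
```

A missing header produces None instead of raising, which matches the Optional[str] return type.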

class haralyzer.mixins.HttpTransaction(entry: dict)[source]

Bases: GetHeaders, MimicDict

Class that represents a request or response

property formatted: str

Formatted HttpTransaction string for pretty print.

Returns:

formatted string

Return type:

str

property headers: list

Headers from the entry

Returns:

Headers from both request and response

Return type:

list

class haralyzer.mixins.MimicDict[source]

Bases: MutableMapping

Mixin for functions to mimic a dictionary for backward compatibility

haralyzer.multihar module

Contains the multi-HAR parser object

class haralyzer.multihar.MultiHarParser(har_data, page_id=None, decimal_precision=0)[source]

Bases: object

An object that represents multiple HAR files OF THE SAME CONTENT. It is used to gather overall statistical data in situations where you have multiple runs against the same web asset, which is common in performance testing.

property asset_types: dict

Mimic the asset types stored in HarPage

Returns:

Asset types from HarPage

Return type:

dict

property audio_load_time: int | float
Returns:

Aggregate audio load time for all pages. Can be an int or float depending on the self.decimal_precision

Return type:

int, float

property css_load_time: int | float
Returns:

Aggregate css load time for all pages. Can be an int or float depending on the self.decimal_precision

Return type:

int, float

get_load_times(asset_type: str) list[source]

Returns a list of the load times of a given asset type, one for each page

Parameters:

asset_type (str) – The asset type to return load times for

Returns:

List of load times

Return type:

list

get_stdev(asset_type: str) int | float[source]

Returns the standard deviation of the load times for a given asset type.

Parameters:

asset_type (str) – The asset type to calculate standard deviation for.

Returns:

Standard deviation, which can be an int or float depending on the self.decimal_precision

Return type:

int, float
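How the result can be an int or a float depending on the precision can be sketched with the standard statistics module. This is an illustration under assumptions: load_time_stdev is a hypothetical name, the sample standard deviation (statistics.stdev) is assumed rather than the population one, and the load times are invented values in milliseconds.

```python
import statistics


def load_time_stdev(load_times, decimal_precision=0):
    """Sample standard deviation, rounded to the requested precision."""
    stdev = statistics.stdev(load_times)
    if decimal_precision == 0:
        # No decimals requested: return a plain int.
        return int(round(stdev))
    return round(stdev, decimal_precision)


# Hypothetical load times (ms) of the same page across three runs.
runs = [100, 110, 120]
print(load_time_stdev(runs))
print(load_time_stdev([100, 110, 125], decimal_precision=2))
```

statistics.stdev needs at least two data points, so a single run has no standard deviation under this sketch.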

property html_load_time: int | float
Returns:

Aggregate html load time for all pages. Can be an int or float depending on the self.decimal_precision

Return type:

int, float

property image_load_time: int | float
Returns:

Aggregate image load time for all pages. Can be an int or float depending on the self.decimal_precision

Return type:

int, float

property js_load_time: int | float
Returns:

Aggregate javascript load time. Can be an int or float depending on the self.decimal_precision

Return type:

int, float

property page_load_time: int | float
Returns:

Average total load time for all runs (not weighted). Can be an int or float depending on the self.decimal_precision

Return type:

int, float

property pages: List[HarPage]

Aggregate pages of all the parser objects.

Returns:

All the pages from parsers

Return type:

List[haralyzer.assets.HarPage]

property time_to_first_byte: int | float
Returns:

The aggregate time to first byte for all pages. Can be an int or float depending on the self.decimal_precision

Return type:

int, float

property video_load_time: int | float
Returns:

Aggregate video load time for all pages. Can be an int or float depending on the self.decimal_precision

Return type:

int, float