haralyzer package¶
Submodules¶
haralyzer.assets module¶
Provides all the main functional classes for analyzing HAR files
- class haralyzer.assets.HarEntry(entry: dict)[source]¶
Bases:
MimicDict
An object that represents one entry in a HAR page
- property cache: str¶
- Returns:
Cached objects
- Return type:
str
- property cookies: list¶
- Returns:
Request and Response Cookies
- Return type:
list
- property pageref: str¶
- Returns:
Page for the entry
- Return type:
str
- property port: int¶
- Returns:
Port connection was made to
- Return type:
int
- property secure: bool¶
- Returns:
Connection was secure
- Return type:
bool
- property serverAddress: str¶
- Returns:
IP Address of the server
- Return type:
str
- property startTime: datetime | None¶
Start time and date
- Returns:
Start time of entry
- Return type:
Optional[datetime.datetime]
- property status: int¶
- Returns:
HTTP Status Code
- Return type:
int
- property time: int¶
- Returns:
Time taken to complete entry
- Return type:
int
- property timings: dict¶
- Returns:
Timing of the page load
- Return type:
dict
- property url: str¶
- Returns:
URL of Entry
- Return type:
str
- class haralyzer.assets.HarPage(page_id: str, har_parser: HarParser = None, har_data: dict = None)[source]¶
Bases:
object
An object representing one page of a HAR resource
- property actual_page: HarEntry¶
Returns the first entry object that does not have a redirect status, indicating that it is the actual page we care about (after redirects).
- Returns:
First entry of the page
- Return type:
HarEntry
- property audio_files: List[HarEntry]¶
All audio files for a page
- Returns:
Audio entries for a page
- Return type:
List[HarEntry]
- property audio_load_time: int¶
Audio load time
- Returns:
Load time for audio on a page
- Return type:
int
- property audio_size: int¶
Size of audio files from the page
- Returns:
Size of audio files on the page
- Return type:
int
- property audio_size_trans: int¶
Audio transfer size
- Returns:
Size of transfer data for audio
- Return type:
int
- property content_load_time: int¶
Content load time
- Returns:
Load time for all content
- Return type:
int
- property css_files: List[HarEntry]¶
All CSS files for a page
- Returns:
CSS entries for a page
- Return type:
List[HarEntry]
- property css_load_time: int¶
CSS load time
- Returns:
Load time for CSS on a page
- Return type:
int
- property css_size: int¶
Size of CSS files from the page
- Returns:
Size of CSS files on the page
- Return type:
int
- property css_size_trans: int¶
CSS transfer size
- Returns:
Size of transfer data for CSS
- Return type:
int
- property duplicate_url_request: dict¶
Returns a dict that maps each URL requested more than once to the number of times it was requested
- Returns:
URLs and the number of times each was requested
- Return type:
dict
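The idea behind this property can be sketched without the library. The following is a minimal illustration, assuming the request URLs are already available as a list of strings; the helper name duplicate_urls is hypothetical and is not part of haralyzer.

```python
from collections import Counter
from typing import Dict, Iterable


def duplicate_urls(urls: Iterable[str]) -> Dict[str, int]:
    """Map each URL requested more than once to its request count."""
    counts = Counter(urls)
    # Keep only URLs that appear at least twice.
    return {url: n for url, n in counts.items() if n > 1}


requested = [
    "https://example.com/",
    "https://example.com/app.js",
    "https://example.com/app.js",
]
print(duplicate_urls(requested))  # {'https://example.com/app.js': 2}
```

URLs that were requested exactly once do not appear in the result.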
- property entries: List[HarEntry]¶
- Returns:
All entries that make up the page
- Return type:
List[HarEntry]
- filter_entries(request_type: str = None, content_type: str = None, status_code: str = None, http_version: str = None, load_time__gt: int = None, regex: bool = True) List[HarEntry][source]¶
Generates a list of entries that match the given criteria
- Parameters:
request_type (str) – The request type (e.g. GET or POST)
content_type (str) – Regex to use for finding content type
status_code (str) – The desired status code
http_version (str) – HTTP version of request
load_time__gt (int) – Load time in milliseconds. If provided, an entry whose load time is less than this value will be excluded from the results.
regex (bool) – Whether to use regex or exact match.
- Returns:
List of entry objects based on the filtered criteria.
- Return type:
List[HarEntry]
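How the criteria combine can be sketched on plain dictionaries shaped like HAR entries. This is only an illustration under stated assumptions: the function name filter_entries_sketch is hypothetical, only two of the parameters are modelled, regex matching is assumed to use re.search, and the library's own implementation may differ.

```python
import re
from typing import List, Optional


def filter_entries_sketch(
    entries: List[dict],
    request_type: Optional[str] = None,
    load_time__gt: Optional[int] = None,
    regex: bool = True,
) -> List[dict]:
    """Keep only entries that satisfy every criterion that was supplied."""
    results = []
    for entry in entries:
        method = entry["request"]["method"]
        if request_type is not None:
            if regex and re.search(request_type, method) is None:
                continue
            if not regex and request_type != method:
                continue
        # Per the parameter description, entries faster than the threshold are dropped.
        if load_time__gt is not None and entry["time"] < load_time__gt:
            continue
        results.append(entry)
    return results


entries = [
    {"request": {"method": "GET"}, "time": 120},
    {"request": {"method": "POST"}, "time": 30},
    {"request": {"method": "GET"}, "time": 500},
]
slow_gets = filter_entries_sketch(entries, request_type="GET", load_time__gt=200)
print([e["time"] for e in slow_gets])  # [500]
```

A criterion left as None is simply not applied, so calling the sketch with no criteria returns every entry.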
- get_load_time(request_type: str = None, content_type: str = None, status_code: str = None, asynchronous: bool = True, **kwargs) int[source]¶
This method can return either the TOTAL load time for the assets or the ACTUAL load time. The difference is that the actual load time takes asynchronous (parallel) transactions into account. To get the total load time, set asynchronous=False.
EXAMPLE:
Suppose a page has two images, each of which took 2 seconds to download, and the browser downloaded them at the same time.
self.get_load_time(content_type='image') (returns 2)
self.get_load_time(content_type='image', asynchronous=False) (returns 4)
- Parameters:
request_type (str) – The request type (e.g. GET or POST)
content_type (str) – Regex to use for finding content type
status_code (str) – The desired status code
asynchronous (bool) – Whether to account for assets that loaded in parallel. Set to False to sum the individual load times instead.
- Returns:
Total load time
- Return type:
int
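The difference between the total and the actual load time can be illustrated with a small, library-independent sketch. It assumes each asset is described by a hypothetical (start offset, duration) tuple in milliseconds; the helper names total_load_time and actual_load_time are illustrative only, and haralyzer's internal approach may differ.

```python
from typing import List, Tuple

Asset = Tuple[int, int]  # (start offset in ms, duration in ms); hypothetical shape


def total_load_time(assets: List[Asset]) -> int:
    """Sum every duration, ignoring overlap (comparable to asynchronous=False)."""
    return sum(duration for _, duration in assets)


def actual_load_time(assets: List[Asset]) -> int:
    """Count each millisecond once by merging overlapping intervals."""
    covered = 0
    current_start = current_end = None
    for start, duration in sorted(assets):
        end = start + duration
        if current_end is None or start > current_end:
            # Gap before this asset: close the previous merged interval.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlap: extend the merged interval if this asset ends later.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered


images = [(0, 2000), (0, 2000)]  # two images downloaded in parallel
print(actual_load_time(images))  # 2000
print(total_load_time(images))   # 4000
```

When no assets overlap, both functions return the same value.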
- property get_requests: List[HarEntry]¶
Returns a list of GET requests, each of which is a HarEntry object
- Returns:
All GET requests
- Return type:
List[HarEntry]
- static get_total_size(entries: List[HarEntry]) int[source]¶
Returns the total size of a collection of entries.
- Parameters:
entries – list of entries to calculate the total size of.
- Returns:
Total size of entries
- Return type:
int
- static get_total_size_trans(entries: List[HarEntry]) int[source]¶
Returns the total size of a collection of entries - transferred.
NOTE: Use with HAR files generated by chrome-har-capturer
- Parameters:
entries – list of entries to calculate the total size of.
- Returns:
Total size of entries that was transferred
- Return type:
int
- property hostname: str¶
- Returns:
Hostname of the initial request
- Return type:
str
- property html_files: List[HarEntry]¶
All HTML files for a page
- Returns:
HTML entries for a page
- Return type:
List[HarEntry]
- property html_load_time: int¶
HTML load time
- Returns:
Load time for HTML on a page
- Return type:
int
- property image_files: List[HarEntry]¶
All image files for a page
- Returns:
Image entries for a page
- Return type:
List[HarEntry]
- property image_load_time: int¶
Image load time
- Returns:
Load time for images on a page
- Return type:
int
- property image_size: int¶
Size of image files from the page
- Returns:
Size of image files on the page
- Return type:
int
- property image_size_trans: int¶
Image transfer size
- Returns:
Size of transfer data for images
- Return type:
int
- property initial_load_time: int¶
Initial load time
- Returns:
Initial load time of the page
- Return type:
int
- property js_files: List[HarEntry]¶
All JS files for a page
- Returns:
JS entries for a page
- Return type:
List[HarEntry]
- property js_load_time: int¶
JS load time
- Returns:
Load time for JS on a page
- Return type:
int
- property js_size: int¶
Size of JS files from the page
- Returns:
Size of JS files on the page
- Return type:
int
- property js_size_trans: int¶
JS transfer size
- Returns:
Size of transfer data for JS
- Return type:
int
- property page_load_time: int¶
Load time of the page
- Returns:
Load time for the page
- Return type:
int
- property page_size: int¶
Size of the page
- Returns:
Size of the page
- Return type:
int
- property page_size_trans: int¶
Page transfer size
- Returns:
Size of transfer data for the page
- Return type:
int
- property post_requests: List[HarEntry]¶
Returns a list of POST requests, each of which is a HarEntry object
- Returns:
All POST requests
- Return type:
List[HarEntry]
- property text_files: List[HarEntry]¶
All text files for a page
- Returns:
Text entries for a page
- Return type:
List[HarEntry]
- property text_size: int¶
Size of text files from the page
- Returns:
Size of text files on the page
- Return type:
int
- property text_size_trans: int¶
Text transfer size
- Returns:
Size of transfer data for text
- Return type:
int
- property time_to_first_byte: int | None¶
- Returns:
Time to first byte of the page request in ms
- Return type:
Optional[int]
- property url: str | None¶
The absolute URL of the initial request.
- Returns:
URL of first request
- Return type:
Optional[str]
- property video_files: List[HarEntry]¶
All video files for a page
- Returns:
Video entries for a page
- Return type:
List[HarEntry]
- property video_load_time: int¶
Video load time
- Returns:
Load time for video on a page
- Return type:
int
- property video_size: int¶
Size of video files from the page
- Returns:
Size of video files on the page
- Return type:
int
- property video_size_trans: int¶
Video transfer size
- Returns:
Size of transfer data for video
- Return type:
int
- class haralyzer.assets.HarParser(har_data: dict = None)[source]¶
Bases:
object
A basic HAR parser that also adds helpful functionality for analyzing the performance of a web page.
- property browser: str¶
Browser of Har File
- Returns:
Browser of the Har File
- Return type:
str
- static create_asset_timeline(asset_list: List[HarEntry]) dict[source]¶
Returns a dict of the timeline for the requested assets. The key is a datetime object (down to the millisecond) of ANY time where at least one of the requested assets was loaded. The value is a list of ALL assets that were loading at that time.
- Parameters:
asset_list (List[HarEntry]) – The assets to create a timeline for.
- Returns:
Milliseconds and assets that were loaded
- Return type:
dict
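The shape of such a timeline can be sketched with the standard library alone. This is an illustration under stated assumptions: each asset is a hypothetical (name, start time, duration in ms) tuple, the end of the interval is treated as exclusive, and the function name asset_timeline_sketch is not part of haralyzer.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

Asset = Tuple[str, datetime, int]  # (name, start time, duration in ms); hypothetical shape


def asset_timeline_sketch(assets: List[Asset]) -> Dict[datetime, List[str]]:
    """Map every millisecond in which something was loading to the assets loading then."""
    timeline: Dict[datetime, List[str]] = defaultdict(list)
    for name, start, duration_ms in assets:
        for offset in range(duration_ms):
            timeline[start + timedelta(milliseconds=offset)].append(name)
    return dict(timeline)


start = datetime(2024, 1, 1, 12, 0, 0)
assets = [
    ("a.css", start, 3),                             # loading during ms 0, 1, 2
    ("b.js", start + timedelta(milliseconds=2), 2),  # loading during ms 2, 3
]
timeline = asset_timeline_sketch(assets)
print(len(timeline))                                  # 4
print(timeline[start + timedelta(milliseconds=2)])    # ['a.css', 'b.js']
```

Milliseconds in which nothing was loading have no key in the resulting dict.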
- property creator: str¶
Creator of Har File. Usually the same as the browser but not always
- Returns:
Program that created the HarFile
- Return type:
str
- static from_file(file: [str, bytes]) HarParser[source]¶
Creates a HarParser from a file path
- Parameters:
file ([str, bytes]) – Path to HAR file or bytes of HAR file
- Returns:
HarParser Object
- Return type:
HarParser
- static from_string(data: [str, bytes])[source]¶
Loads a string or bytes as a HarParser
- Parameters:
data ([str, bytes]) – Input string or bytes
- Returns:
HarParser Object
- Return type:
HarParser
- property hostname: str¶
Hostname of first page
- Returns:
Hostname of the first known page
- Return type:
str
- static match_content_type(entry: HarEntry, content_type: str, regex: bool = True) bool[source]¶
Matches the content type of a request using the mimeType metadata.
- Parameters:
entry (HarEntry) – Entry to analyze
content_type (str) – Regex to use for finding content type
regex (bool) – Whether to use regex or exact match.
- Returns:
Mime type matches
- Return type:
bool
- static match_headers(entry: HarEntry, header_type: str, header: str, value: str, regex: bool = True) bool[source]¶
Function to match headers.
Since the output of headers might use different case, like:
‘content-type’ vs ‘Content-Type’
This function is case-insensitive
- Parameters:
entry (HarEntry) – Entry to analyze
header_type (str) – Header type. Valid values: ‘request’ or ‘response’
header (str) – The header to search for
value (str) – The value to search for
regex (bool) – Whether to use regex or exact match
- Returns:
Whether a match was found
- Return type:
bool
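Case-insensitive header matching can be sketched on a plain dictionary laid out like a HAR entry, where headers are a list of name/value objects. The function name match_headers_sketch is hypothetical. The sketch assumes that only header names are compared case-insensitively and that regex matching uses re.search; the library may treat values differently.

```python
import re


def match_headers_sketch(entry: dict, header_type: str, header: str,
                         value: str, regex: bool = True) -> bool:
    """Return True if a header with the given name has a matching value."""
    if header_type not in ("request", "response"):
        raise ValueError("header_type must be 'request' or 'response'")
    for item in entry[header_type]["headers"]:
        # Header names are compared without regard to case.
        if item["name"].lower() != header.lower():
            continue
        if regex and re.search(value, item["value"]) is not None:
            return True
        if not regex and item["value"] == value:
            return True
    return False


entry = {
    "request": {"headers": []},
    "response": {
        "headers": [{"name": "Content-Type", "value": "text/html; charset=utf-8"}]
    },
}
print(match_headers_sketch(entry, "response", "content-type", "text/html"))         # True
print(match_headers_sketch(entry, "response", "content-type", "text/html", False))  # False
```

The second call returns False because an exact match requires the whole header value, including the charset suffix.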
- static match_http_version(entry: HarEntry, http_version: str, regex: bool = True) bool[source]¶
Helper function that checks whether the HTTP version of an entry matches the given http_version argument.
- Parameters:
entry (HarEntry) – Entry to analyze
http_version (str) – HTTP version type to match
regex (bool) – Whether to use a regex or string match
- Returns:
HTTP version matches
- Return type:
bool
- static match_request_type(entry: HarEntry, request_type: str, regex: bool = True) bool[source]¶
Helper function that checks whether the request type of an entry matches the given request_type argument.
- Parameters:
entry (HarEntry) – Entry to analyze
request_type (str) – Request type to match
regex (bool) – Whether to use a regex or string match
- Returns:
Request method matches
- Return type:
bool
- static match_status_code(entry: HarEntry, status_code: str, regex: bool = True) bool[source]¶
Helper function that checks whether the status code of an entry matches the given status_code argument.
NOTE: This is doing a STRING comparison NOT NUMERICAL
- Parameters:
entry (HarEntry) – Entry to analyze
status_code (str) – Status code to search for
regex (bool) – Whether to use a regex or string match
- Returns:
Status code matches
- Return type:
bool
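The consequence of comparing status codes as strings can be shown with a short sketch. It assumes regex matching behaves like re.search on the stringified status; the function name match_status_code_sketch is hypothetical and takes a bare integer status rather than a HarEntry.

```python
import re


def match_status_code_sketch(status: int, status_code: str, regex: bool = True) -> bool:
    """Compare a numeric status against a string pattern or exact string."""
    status_text = str(status)  # the comparison is textual, not numerical
    if regex:
        return re.search(status_code, status_text) is not None
    return status_text == status_code


print(match_status_code_sketch(200, "2.."))               # True: any 2xx code
print(match_status_code_sketch(301, "^3"))                # True: any code starting with 3
print(match_status_code_sketch(200, "20"))                # True: "20" is found inside "200"
print(match_status_code_sketch(200, "20", regex=False))   # False: exact match required
```

Under this assumption, anchoring a pattern (for example "^404$") avoids accidental substring matches when regex is enabled.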
- property pages: List[HarPage]¶
This is a list of HarPage objects, each of which represents a page from the HAR file.
- Returns:
HarPages in the file
- Return type:
List[HarPage]
- property version: str¶
HAR Version
- Returns:
Version of HAR used
- Return type:
str
haralyzer.errors module¶
Custom exceptions for good ol haralyzer.
haralyzer.http module¶
Creates the Request and Response subclasses that are used by each entry
- class haralyzer.http.Request(entry: dict)[source]¶
Bases:
HttpTransaction
Request object for a HarEntry
- property accept: str¶
- Returns:
HTTP Accept header
- Return type:
str
- property bodySize: int¶
- Returns:
Body size of the request
- Return type:
int
- property cacheControl: str¶
- Returns:
HTTP CacheControl header
- Return type:
str
- property cookies: list¶
- Returns:
Cookies from the request
- Return type:
list
- property encoding: str¶
- Returns:
HTTP Accept-Encoding Header
- Return type:
str
- property headersSize: int¶
- Returns:
Headers size from the request
- Return type:
int
- property host: str¶
- Returns:
HTTP Host header
- Return type:
str
- property httpVersion: str¶
- Returns:
HTTP version used in the request
- Return type:
str
- property language: str¶
- Returns:
HTTP language header
- Return type:
str
- property method: str¶
- Returns:
HTTP method of the request
- Return type:
str
- property mimeType: str | None¶
- Returns:
Mime Type of request
- Return type:
Optional[str]
- property queryString: list¶
- Returns:
Query string from the request
- Return type:
list
- property text: str | None¶
- Returns:
Request body
- Return type:
Optional[str]
- property url: str¶
- Returns:
URL of the request
- Return type:
str
- property userAgent: str¶
- Returns:
User Agent
- Return type:
str
- class haralyzer.http.Response(url: str, entry: dict)[source]¶
Bases:
HttpTransaction
Response object for a HarEntry
- property bodySize: int¶
- Returns:
Body Size
- Return type:
int
- property cacheControl: str¶
- Returns:
Cache Control Header
- Return type:
str
- property contentSecurityPolicy: str¶
- Returns:
Content Security Policy Header
- Return type:
str
- property contentSize: int¶
- Returns:
Content Size
- Return type:
int
- property contentType: str¶
- Returns:
Content Type
- Return type:
str
- property date: str¶
- Returns:
Date of response
- Return type:
str
- property headersSize: int¶
- Returns:
Header size
- Return type:
int
- property httpVersion: str¶
- Returns:
HTTP Version
- Return type:
str
- property lastModified: str¶
- Returns:
Last modified time
- Return type:
str
- property mimeType: str¶
- Returns:
Mime Type of response
- Return type:
str
- property redirectURL: str | None¶
- Returns:
Redirect URL
- Return type:
Optional[str]
- property status: int¶
- Returns:
HTTP Status
- Return type:
int
- property statusText: str¶
- Returns:
HTTP Status Text
- Return type:
str
- property text: str¶
- Returns:
Response body
- Return type:
str
- property textEncoding: str¶
- Returns:
How the response body is encoded
- Return type:
str
haralyzer.mixins module¶
Mixin Objects that allow for shared methods
- class haralyzer.mixins.HttpTransaction(entry: dict)[source]¶
Bases:
GetHeaders, MimicDict
Class that represents a request or response
- property formatted: str¶
Formatted HttpTransaction string for pretty print.
- Returns:
formatted string
- Return type:
str
- property headers: list¶
Headers from the entry
- Returns:
Headers from both request and response
- Return type:
list
haralyzer.multihar module¶
Contains the multi-HAR parser object
- class haralyzer.multihar.MultiHarParser(har_data, page_id=None, decimal_precision=0)[source]¶
Bases:
object
An object that represents multiple HAR files OF THE SAME CONTENT. It is used to gather overall statistical data in situations where you have multiple runs against the same web asset, which is common in performance testing.
- property asset_types: dict¶
Mimic the asset types stored in HarPage
- Returns:
Asset types from HarPage
- Return type:
dict
- property audio_load_time: int | float¶
- Returns:
Aggregate audio load time for all pages. Can be an int or float depending on the self.decimal_precision
- Return type:
int, float
- property css_load_time: int | float¶
- Returns:
Aggregate css load time for all pages. Can be an int or float depending on the self.decimal_precision
- Return type:
int, float
- get_load_times(asset_type: str) list[source]¶
Just a list of the load times of a certain asset type for each page
- Parameters:
asset_type (str) – The asset type to return load times for
- Returns:
List of load times
- Return type:
list
- get_stdev(asset_type: str) int | float[source]¶
Returns the standard deviation for a set of a certain asset type.
- Parameters:
asset_type (str) – The asset type to calculate standard deviation for.
- Returns:
Standard deviation, which can be an int or float depending on the self.decimal_precision
- Return type:
int, float
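How decimal precision affects the result can be sketched with the standard library. This is an illustration only: it assumes the sample standard deviation (statistics.stdev) is used and that a precision of 0 yields an int; the function name load_time_stdev is hypothetical and the library's rounding behavior may differ.

```python
from statistics import stdev
from typing import List, Union


def load_time_stdev(load_times: List[int], decimal_precision: int = 0) -> Union[int, float]:
    """Sample standard deviation of load times, rounded to the given precision."""
    value = round(stdev(load_times), decimal_precision)
    # With a precision of 0, return a whole number instead of a float.
    return int(value) if decimal_precision == 0 else value


runs = [1000, 1100, 1200]  # load times in ms from three runs of the same page
print(load_time_stdev(runs))     # 100
print(load_time_stdev(runs, 2))  # 100.0
```

Note that statistics.stdev needs at least two data points, so a single run is not enough to compute a deviation.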
- property html_load_time: int | float¶
- Returns:
Aggregate html load time for all pages. Can be an int or float depending on the self.decimal_precision
- Return type:
int, float
- property image_load_time: int | float¶
- Returns:
Aggregate image load time for all pages. Can be an int or float depending on the self.decimal_precision
- Return type:
int, float
- property js_load_time: int | float¶
- Returns:
Aggregate JavaScript load time for all pages. Can be an int or float depending on the self.decimal_precision
- Return type:
int, float
- property page_load_time: int | float¶
- Returns:
Average total load time for all runs (not weighted). Can be an int or float depending on the self.decimal_precision
- Return type:
int, float
- property pages: List[HarPage]¶
Aggregate pages of all the parser objects.
- Returns:
All the pages from parsers
- Return type:
List[haralyzer.assets.HarPage]
- property time_to_first_byte: int | float¶
- Returns:
The aggregate time to first byte for all pages. Can be an int or float depending on the self.decimal_precision
- Return type:
int, float
- property video_load_time: int | float¶
- Returns:
Aggregate video load time for all pages. Can be an int or float depending on the self.decimal_precision
- Return type:
int, float