Extractors
- class chepy.modules.extractors.Extractors(*data)
- aws_account_id_from_access_key()
Extract AWS account id from access key
- Returns
The Chepy object.
- Return type
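As a rough illustration of how an account ID can be recovered from an access key ID, the sketch below follows the publicly documented approach of base32-decoding the key after its four-character prefix and reading a 40-bit field from the result. The function name `account_id_from_key` is hypothetical, this is not necessarily how Chepy implements the method, and the layout is assumed not to hold for every key format.

```python
import base64


def account_id_from_key(access_key_id: str) -> str:
    """Recover the 12-digit account ID embedded in an access key ID (sketch)."""
    # Drop the 4-character prefix (for example "AKIA" or "ASIA").
    payload = base64.b32decode(access_key_id[4:])
    # Read the first 6 bytes as a big-endian integer.
    value = int.from_bytes(payload[:6], byteorder="big")
    # Clear the top bit and the low 7 bits, then shift the 40-bit field down.
    account = (value & 0x7FFFFFFFFF80) >> 7
    return f"{account:012d}"


# Illustrative key ID only; hypothetical input for the sketch.
print(account_id_from_key("ASIAQNZGKIQY56JQ7WML"))
```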
- css_selector(query: str)
Extract data using valid CSS selectors
- Parameters
query (str) – Required. CSS query
- Returns
The Chepy object.
- Return type
Examples
>>> c = Chepy("http://example.com")
>>> c.http_request()
>>> c.css_selector("title")
>>> c.get_by_index(0)
>>> c.o
"<title>Example Domain</title>"
- decode_zero_width(_zw_chars: str = '\u200c\u200d\u202c\ufeff') → ExtractorsT
Extract zero-width characters. Implements the decoder for https://330k.github.io/misc_tools/unicode_steganography.html
- Parameters
_zw_chars (str, optional) – Characters used for the steganography. Defaults to '\u200c\u200d\u202c\ufeff'.
- Returns
The Chepy object.
- Return type
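To make the idea of zero-width steganography concrete, here is a minimal round-trip sketch. It assumes a fixed-width base-N encoding in which N is the number of carrier characters and each hidden character becomes a run of carriers; that layout is an assumption for illustration, and the referenced tool and Chepy may differ in detail. The names `zw_encode`, `zw_decode` and `code_width` are hypothetical.

```python
ZW_CHARS = "\u200c\u200d\u202c\ufeff"  # default carrier set from the signature above


def code_width(radix: int) -> int:
    """Smallest number of base-`radix` digits that can hold a 16-bit code unit."""
    width = 1
    while radix ** width < 65536:
        width += 1
    return width


def zw_encode(hidden: str, chars: str = ZW_CHARS) -> str:
    """Encode each hidden character (code point below 65536) as carrier digits."""
    radix, width = len(chars), code_width(len(chars))
    out = []
    for ch in hidden:
        code, digits = ord(ch), []
        for _ in range(width):
            code, digit = divmod(code, radix)
            digits.append(chars[digit])
        out.append("".join(reversed(digits)))  # most significant digit first
    return "".join(out)


def zw_decode(text: str, chars: str = ZW_CHARS) -> str:
    """Collect carrier characters from `text` and rebuild the hidden string."""
    radix, width = len(chars), code_width(len(chars))
    digits = [chars.index(c) for c in text if c in chars]
    out = []
    for i in range(0, len(digits) - width + 1, width):
        code = 0
        for digit in digits[i:i + width]:
            code = code * radix + digit
        out.append(chr(code))
    return "".join(out)


cover = "he" + zw_encode("secret") + "llo"  # looks like "hello" when rendered
print(zw_decode(cover))
```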
- extract_auth_basic() → ExtractorsT
Extract basic authentication tokens
- Returns
The Chepy object.
- Return type
- extract_auth_bearer() → ExtractorsT
Extract bearer authentication tokens
- Returns
The Chepy object.
- Return type
- extract_base64(min: int = 20) → ExtractorsT
Extract base64 encoded strings
- Parameters
min (int, optional) – Minimum length to match. Defaults to 20.
- Returns
The Chepy object.
- Return type
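The effect of a minimum match length can be sketched with a plain regular expression. This is an assumption about the general technique, not Chepy's actual pattern; `find_base64` and `min_len` are hypothetical names, and whether padding counts toward the minimum is a choice made here for illustration.

```python
import base64
import re


def find_base64(text: str, min_len: int = 20) -> list:
    """Return runs of base64 alphabet characters at least `min_len` long."""
    # Padding ("=" or "==") is optional and not counted toward the minimum.
    pattern = re.compile(r"[A-Za-z0-9+/]{%d,}={0,2}" % min_len)
    return pattern.findall(text)


encoded = base64.b64encode(b"hello world hello world").decode()
print(find_base64(f"token: {encoded} end"))
```

A higher minimum reduces false positives from ordinary words, which are also valid base64 alphabet runs.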
- extract_domains(is_binary: bool = False) → ExtractorsT
Extract domains
- Parameters
is_binary (bool, optional) – Set to True if the state is in binary format; strings are then extracted from it before matching. Defaults to False.
- Returns
The Chepy object.
- Return type
- extract_dsa_private() → ExtractorsT
Extract DSA private key
- Returns
The Chepy object.
- Return type
- extract_email(is_binary: bool = False) → ExtractorsT
Extract email
- Parameters
is_binary (bool, optional) – Set to True if the state is in binary format; strings are then extracted from it before matching. Defaults to False.
- Returns
The Chepy object.
- Return type
Examples
Sometimes the state is in a binary format and is not readable. In that case, set is_binary to True.
>>> Chepy("tests/files/test.der").load_file().extract_email(is_binary=True).o
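The "extract strings first, then match" behaviour described for `is_binary` can be sketched with the standard library alone. The patterns and the name `emails_from_binary` are hypothetical and deliberately simple; Chepy's own patterns may be stricter.

```python
import re

# Runs of at least four printable ASCII bytes, similar to the Unix `strings` tool.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")
# Simplified email pattern, for illustration only.
EMAIL = re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def emails_from_binary(blob: bytes) -> list:
    """Pull printable runs out of `blob`, then search each run for emails."""
    found = []
    for run in PRINTABLE_RUN.findall(blob):
        found.extend(EMAIL.findall(run))
    return found


blob = b"\x00\x01\x82admin@example.com\x00\xffno-mail-here\x00"
print(emails_from_binary(blob))
```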
- extract_facebook_access_token() → ExtractorsT
Extract Facebook access tokens
- Returns
The Chepy object.
- Return type
- extract_github() → ExtractorsT
Extract GitHub access token
- Returns
The Chepy object.
- Return type
- extract_google_api() → ExtractorsT
Extract Google API keys
- Returns
The Chepy object.
- Return type
- extract_google_captcha() → ExtractorsT
Extract Google captcha keys
- Returns
The Chepy object.
- Return type
- extract_google_oauth() → ExtractorsT
Extract Google OAuth keys
- Returns
The Chepy object.
- Return type
- extract_hashes() → ExtractorsT
Extract MD5, SHA-1, SHA-256 and SHA-512 hashes
- Returns
The Chepy object.
- Return type
Examples
>>> Chepy(
>>>     ["60b725f10c9c85c70d97880dfe8191b3", "3f786850e387550fdab836ed7e6dc881de23001b"]
>>> ).extract_hashes()
{'md5': [b'60b725f10c9c85c70d97880dfe8191b3'], 'sha1': [b'3f786850e387550fdab836ed7e6dc881de23001b'], 'sha256': [], 'sha512': []}
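One plausible way to sort candidate hashes into these four buckets is by hex digest length. The sketch below assumes that approach; it is not taken from Chepy's source, it works on `str` rather than bytes, and `classify_hashes` is a hypothetical name.

```python
import re

# Hex digest lengths for each algorithm.
HASH_LENGTHS = {"md5": 32, "sha1": 40, "sha256": 64, "sha512": 128}


def classify_hashes(text: str) -> dict:
    """Group hex strings in `text` by the digest length they match."""
    return {
        # Word boundaries stop a 40-character digest from matching as a 32-character one.
        name: re.findall(rf"\b[a-fA-F0-9]{{{length}}}\b", text)
        for name, length in HASH_LENGTHS.items()
    }


sample = "60b725f10c9c85c70d97880dfe8191b3 3f786850e387550fdab836ed7e6dc881de23001b"
print(classify_hashes(sample))
```

Length alone cannot tell apart algorithms that share a digest size, so results of this kind are best treated as candidates.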
- extract_html_tags(tags: List[str])
Extract tags from html along with their attributes
- Parameters
tags (List[str]) – A list of HTML tags
- Returns
The Chepy object.
- Return type
Examples
>>> Chepy("http://example.com").http_request().extract_html_tags(['p']).o
[
    {'tag': 'p', 'attributes': {}},
    {'tag': 'p', 'attributes': {}},
    {'tag': 'p', 'attributes': {}}
]
- extract_ips(is_binary: bool = False) → ExtractorsT
Extract IPv4 and IPv6 addresses
- Parameters
is_binary (bool, optional) – Set to True if the state is in binary format; strings are then extracted from it before matching. Defaults to False.
- Returns
The Chepy object.
- Return type
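For comparison, address extraction can also be done by validating candidate tokens with the standard library `ipaddress` module instead of relying on a single regular expression. This is an alternative sketch, not Chepy's method, and `find_ips` is a hypothetical name.

```python
import ipaddress
import re


def find_ips(text: str) -> list:
    """Return tokens in `text` that parse as IPv4 or IPv6 addresses."""
    found = []
    # Candidate tokens: runs of hex digits, dots and colons.
    for token in re.findall(r"[0-9A-Fa-f:.]+", text):
        token = token.strip(".")  # drop sentence punctuation such as a trailing dot
        try:
            found.append(str(ipaddress.ip_address(token)))
        except ValueError:
            continue  # not a valid address, for example "999.1.1.1"
    return found


print(find_ips("from 192.168.1.10 to 2001:db8::1, not 999.1.1.1"))
```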
- extract_mac_address(is_binary: bool = False) → ExtractorsT
Extract MAC addresses
- Parameters
is_binary (bool, optional) – Set to True if the state is in binary format; strings are then extracted from it before matching. Defaults to False.
- Returns
The Chepy object.
- Return type
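A minimal sketch of MAC address matching, assuming the common six-octet notation with colon or hyphen separators. The pattern and the name `find_macs` are illustrative and are not taken from Chepy.

```python
import re

# Six pairs of hex digits separated by ":" or "-".
MAC = re.compile(r"\b(?:[0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2}\b")


def find_macs(text: str) -> list:
    """Return MAC-like strings found in `text`."""
    return MAC.findall(text)


print(find_macs("eth0 00:1A:2B:3C:4D:5E up, wlan0 aa-bb-cc-dd-ee-ff down"))
```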
- extract_mailgun_api() → ExtractorsT
Extract Mailgun API key
- Returns
The Chepy object.
- Return type
- extract_paypal_bt() → ExtractorsT
Extract PayPal Braintree access token
- Returns
The Chepy object.
- Return type
- extract_rsa_private() → ExtractorsT
Extract RSA private key
- Returns
The Chepy object.
- Return type
- extract_square_access() → ExtractorsT
Extract Square access token
- Returns
The Chepy object.
- Return type
- extract_square_oauth() → ExtractorsT
Extract Square OAuth secret token
- Returns
The Chepy object.
- Return type
- extract_strings(length: int = 4, join_by: Union[str, bytes] = '\n') → ExtractorsT
Extract strings from state
- Parameters
length (int, optional) – Min length of string. Defaults to 4.
join_by (Union[str, bytes], optional) – String to join by. Defaults to newline.
- Returns
The Chepy object.
- Return type
Examples
>>> Chepy("tests/files/hello").load_file().extract_strings().o
__PAGEZERO'
__TEXT'
__text'
__TEXT'
__stubs'
__TEXT'
...
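The roles of `length` and `join_by` can be illustrated with a small stand-alone function that mimics the Unix `strings` tool. This is a sketch under the assumption that only printable ASCII counts as a string; `printable_strings` is a hypothetical name and Chepy's behaviour may differ.

```python
import re


def printable_strings(blob: bytes, length: int = 4, join_by: bytes = b"\n") -> bytes:
    """Join all printable ASCII runs of at least `length` bytes found in `blob`."""
    runs = re.findall(rb"[\x20-\x7e]{%d,}" % length, blob)
    return join_by.join(runs)


blob = b"\x00\x01__TEXT\x00ab\x02__text\xff"
print(printable_strings(blob))
```

Here the two-byte run `ab` is dropped because it is shorter than the minimum length.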
- extract_stripe_api() → ExtractorsT
Extract Stripe standard or restricted API token
- Returns
The Chepy object.
- Return type
- extract_twilio_api() → ExtractorsT
Extract Twilio API key
- Returns
The Chepy object.
- Return type
- extract_twilio_sid() → ExtractorsT
Extract Twilio account or app sid
- Returns
The Chepy object.
- Return type
- extract_urls(is_binary: bool = False) → ExtractorsT
Extract URLs including http, file, ssh and ftp
- Parameters
is_binary (bool, optional) – Set to True if the state is in binary format; strings are then extracted from it before matching. Defaults to False.
- Returns
The Chepy object.
- Return type
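A simplified sketch of scheme-based URL matching for the schemes listed above. The pattern is an assumption made for illustration and is looser than a full URL grammar; `find_urls` is a hypothetical name.

```python
import re

# Match the listed schemes followed by "://" and a run of non-space, non-quote characters.
URL = re.compile(r"\b(?:https?|ftp|ssh|file)://[^\s\"'<>]+")


def find_urls(text: str) -> list:
    """Return URL-like strings with an http, https, ftp, ssh or file scheme."""
    return URL.findall(text)


text = "see https://example.com/a?b=1 and ftp://files.example.org/x.txt or mailto:me@example.com"
print(find_urls(text))
```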
- extract_zero_width_chars_tags() → ExtractorsT
Extract zero-width characters between U+E0000 and U+E007F. Implements https://www.irongeek.com/i.php?page=security/unicode-steganography-homoglyph-encoder
- Returns
The Chepy object.
- Return type
- find_continuous_patterns(str2: Union[str, bytes], min_value: int = 10) → ExtractorsT
Find continuous patterns between the state as a string and the provided str2
- Parameters
str2 (Union[str, bytes]) – String to find matches against
min_value (int, optional) – Minimum value of continuous matches. Defaults to 10.
- Returns
The Chepy object.
- Return type
- find_longest_continious_pattern(str2: str) → ExtractorsT
Find longest continuous pattern
- Parameters
str2 (Union[str, bytes]) – String to find match against
- Returns
The Chepy object.
- Return type
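The notion of a longest continuous pattern shared by two strings can be sketched with `difflib.SequenceMatcher` from the standard library. This swaps in a standard-library technique for illustration and is not necessarily how Chepy computes the result; `longest_common_run` is a hypothetical name.

```python
from difflib import SequenceMatcher


def longest_common_run(a: str, b: str) -> str:
    """Return the longest contiguous substring that appears in both `a` and `b`."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a:match.a + match.size]


print(repr(longest_common_run("the quick brown fox", "a quick brown dog")))
```

A minimum-length threshold such as `min_value` would then amount to discarding any match shorter than the threshold.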
- javascript_comments() → ExtractorsT
Extract javascript comments
Some false positives are expected because of inline // comments
- Returns
The Chepy object.
- Return type
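To show why false positives arise, the sketch below uses a naive pattern for block and line comments. The pattern is an assumption for illustration rather than Chepy's own; note how the `//` inside a URL string is also reported.

```python
import re

# Block comments (non-greedy, may span lines) or line comments up to the end of the line.
JS_COMMENT = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)

source = 'var a = 1; // counter\n/* block\ncomment */\nvar u = "http://example.com";'
matches = JS_COMMENT.findall(source)
for match in matches:
    print(repr(match))
```

The third match comes from the URL literal, not from a comment, which is the kind of false positive the note above refers to.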
- xpath_selector(query: str, namespaces: str = None)
Extract data using valid XPath selectors
- Parameters
query (str) – Required. Xpath query
namespaces (str, optional) – Namespace. Applies for XML data. Defaults to None.
- Returns
The Chepy object.
- Return type
Examples
>>> c = Chepy("http://example.com")
>>> c.http_request()
>>> c.xpath_selector("//title/text()")
>>> c.get_by_index(0)
>>> c.o
"Example Domain"
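For readers without network access, path queries and namespace prefixes can be tried offline with the standard library. Note that `xml.etree.ElementTree` supports only a subset of XPath (for example, no `text()` step), so this is a stand-in for illustration and not the engine Chepy uses; `select_text` is a hypothetical helper and the namespace mapping is passed as a dict here.

```python
import xml.etree.ElementTree as ET


def select_text(markup: str, path: str, namespaces: dict = None) -> list:
    """Parse `markup` and return the text of every element matching `path`."""
    root = ET.fromstring(markup)
    return [element.text for element in root.findall(path, namespaces)]


html = "<html><head><title>Example Domain</title></head><body><p>text</p></body></html>"
print(select_text(html, ".//title"))

# Namespaced XML: the prefix used in the query is mapped to the namespace URI.
feed = '<feed xmlns:a="http://www.w3.org/2005/Atom"><a:title>News</a:title></feed>'
print(select_text(feed, ".//a:title", {"a": "http://www.w3.org/2005/Atom"}))
```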