Extractors

class chepy.modules.extractors.Extractors(*data)
css_selector(query: str)

Extract data using valid CSS selectors

Parameters

query (str) – Required. CSS query

Returns

The Chepy object.

Return type

Chepy

Examples

>>> c = Chepy("http://example.com")
>>> c.http_request()
>>> c.css_selector("title")
>>> c.get_by_index(0)
>>> c.o
"<title>Example Domain</title>"
decode_zero_width(_zw_chars: str = '\u200c\u200d\u202c\ufeff') ExtractorsT

Extract and decode zero-width characters. Implements the decoder from https://330k.github.io/misc_tools/unicode_steganography.html

Parameters

_zw_chars (str, optional) – Zero-width characters used for the steganography. Defaults to '\u200c\u200d\u202c\ufeff'.

Returns

The Chepy object.

Return type

Chepy
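
The decoding idea can be illustrated with a small standard-library sketch. It assumes a simplified scheme in which every hidden character is stored as a fixed run of base-4 digits, one digit per zero-width character. The helpers hide and reveal and the width of 8 digits are illustrative assumptions, not Chepy's implementation.

```python
# Illustrative sketch only; hide/reveal are hypothetical helpers, not Chepy APIs.
ZW_CHARS = "\u200c\u200d\u202c\ufeff"  # same characters as the _zw_chars default
WIDTH = 8  # assumed digits per hidden character (4**8 covers U+0000..U+FFFF)


def hide(cover: str, secret: str) -> str:
    """Append the secret to the cover text as zero-width base-4 digits."""
    hidden = []
    for ch in secret:
        code = ord(ch)
        digits = []
        for _ in range(WIDTH):
            code, rem = divmod(code, len(ZW_CHARS))
            digits.append(ZW_CHARS[rem])
        # digits were produced least significant first, so reverse them
        hidden.append("".join(reversed(digits)))
    return cover + "".join(hidden)


def reveal(text: str) -> str:
    """Collect the zero-width characters and rebuild the hidden string."""
    digits = [ZW_CHARS.index(c) for c in text if c in ZW_CHARS]
    out = []
    for i in range(0, len(digits) - WIDTH + 1, WIDTH):
        code = 0
        for d in digits[i:i + WIDTH]:
            code = code * len(ZW_CHARS) + d
        out.append(chr(code))
    return "".join(out)


stego = hide("hello", "hi")  # looks like "hello" when rendered
recovered = reveal(stego)
```

The visible text is unchanged; only the invisible suffix carries the payload, which is why a custom character set must match the one used when encoding.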

extract_auth_basic() ExtractorsT

Extract basic authentication tokens

Returns

The Chepy object.

Return type

Chepy

extract_auth_bearer() ExtractorsT

Extract bearer authentication tokens

Returns

The Chepy object.

Return type

Chepy

extract_aws_keyid() ExtractorsT

Extract AWS key IDs

Returns

The Chepy object.

Return type

Chepy

extract_aws_s3_url() ExtractorsT

Extract AWS S3 URLs

Returns

The Chepy object.

Return type

Chepy

extract_base64(min: int = 20) ExtractorsT

Extract base64 encoded strings

Parameters

min (int, optional) – Minimum length to match. Defaults to 20.

Returns

The Chepy object.

Return type

Chepy
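
As a rough illustration of how a minimum length filters matches, the following standard-library sketch scans text for runs of base64 alphabet characters. The function find_base64 and its pattern are assumptions made for this example and may differ from the expression Chepy uses.

```python
import base64
import re


def find_base64(text: str, min_len: int = 20) -> list:
    """Hypothetical helper: return runs of base64 characters of at least min_len."""
    # Base64 alphabet, followed by up to two padding characters.
    pattern = re.compile(r"[A-Za-z0-9+/]{%d,}={0,2}" % min_len)
    return pattern.findall(text)


sample = "token=aGVsbG8gd29ybGQsIHRoaXMgaXMgYSB0ZXN0 id=abc123"
matches = find_base64(sample)
decoded = base64.b64decode(matches[0])
```

Short tokens such as abc123 fall below the threshold and are skipped. Any sufficiently long alphanumeric word would still match, so a lower minimum tends to produce more false positives.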

extract_domains(is_binary: bool = False) ExtractorsT

Extract domains

Parameters

is_binary (bool, optional) – Set to True when the state is binary. Printable strings are extracted from it first and then matched. Defaults to False.

Returns

The Chepy object.

Return type

Chepy
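
The two-step behaviour described for is_binary (pull printable strings out of the binary data, then match) can be sketched with the standard library. The helper names and both regular expressions below are assumptions for illustration, not Chepy's own patterns.

```python
import re


def printable_strings(blob: bytes, min_len: int = 4) -> list:
    """Hypothetical helper: runs of printable ASCII, like the Unix strings tool."""
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode() + rb",}")
    return pattern.findall(blob)


def find_domains(blob: bytes) -> list:
    """Hypothetical helper: match a simple domain pattern inside each string."""
    domain = re.compile(rb"(?:[a-z0-9-]+\.)+[a-z]{2,}", re.IGNORECASE)
    found = []
    for chunk in printable_strings(blob):
        found.extend(domain.findall(chunk))
    return found


blob = b"\x00\x01\x02mail.example.com\x00\xff\xfejunk\x00www.example.org\x7f"
domains = find_domains(blob)
```

Extracting the strings first keeps non-printable bytes from gluing unrelated fragments together before the pattern runs.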

extract_dsa_private() ExtractorsT

Extract DSA private key

Returns

The Chepy object.

Return type

Chepy

extract_email(is_binary: bool = False) ExtractorsT

Extract email

Parameters

is_binary (bool, optional) – Set to True when the state is binary. Printable strings are extracted from it first and then matched. Defaults to False.

Returns

The Chepy object.

Return type

Chepy

Examples

When the state is binary and not human-readable, set is_binary to True.

>>> Chepy("tests/files/test.der").load_file().extract_email(is_binary=True).o
extract_facebook_access_token() ExtractorsT

Extract Facebook access tokens

Returns

The Chepy object.

Return type

Chepy

extract_github() ExtractorsT

Extract GitHub access token

Returns

The Chepy object.

Return type

Chepy

extract_google_api() ExtractorsT

Extract Google API keys

Returns

The Chepy object.

Return type

Chepy

extract_google_captcha() ExtractorsT

Extract Google captcha keys

Returns

The Chepy object.

Return type

Chepy

extract_google_oauth() ExtractorsT

Extract Google OAuth keys

Returns

The Chepy object.

Return type

Chepy

extract_hashes() ExtractorsT

Extract md5, sha1, sha256 and sha512 hashes

Returns

The Chepy object.

Return type

Chepy

Examples

>>> Chepy(
...     ["60b725f10c9c85c70d97880dfe8191b3", "3f786850e387550fdab836ed7e6dc881de23001b"]
... ).extract_hashes()
{'md5': [b'60b725f10c9c85c70d97880dfe8191b3'], 'sha1': [b'3f786850e387550fdab836ed7e6dc881de23001b'], 'sha256': [], 'sha512': []}
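
The grouping in the output above follows from digest length: hex digests of md5, sha1, sha256 and sha512 are 32, 40, 64 and 128 characters long. A minimal standard-library sketch of that classification is shown here; classify_hashes is a hypothetical helper that returns str values rather than bytes, and it is not Chepy's code.

```python
import re

# Hex digest lengths for each algorithm.
HASH_LENGTHS = {"md5": 32, "sha1": 40, "sha256": 64, "sha512": 128}


def classify_hashes(text: str) -> dict:
    """Hypothetical helper: bucket standalone hex tokens by digest length."""
    result = {name: [] for name in HASH_LENGTHS}
    for token in re.findall(r"\b[a-fA-F0-9]+\b", text):
        for name, length in HASH_LENGTHS.items():
            if len(token) == length:
                result[name].append(token)
    return result


found = classify_hashes(
    "60b725f10c9c85c70d97880dfe8191b3 3f786850e387550fdab836ed7e6dc881de23001b"
)
```

Length alone cannot tell algorithms with equal digest sizes apart, so the labels are best read as candidates.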
extract_html_comments()

Extract HTML comments

Returns

The Chepy object.

Return type

Chepy

extract_html_tags(tags: List[str])

Extract tags from HTML along with their attributes

Parameters

tags (List[str]) – A list of HTML tag names

Returns

The Chepy object.

Return type

Chepy

Examples

>>> Chepy("http://example.com").http_request().extract_html_tags(['p']).o
[
    {'tag': 'p', 'attributes': {}},
    {'tag': 'p', 'attributes': {}},
    {'tag': 'p', 'attributes': {}}
]
extract_ips(is_binary: bool = False) ExtractorsT

Extract IPv4 and IPv6 addresses

Parameters

is_binary (bool, optional) – Set to True when the state is binary. Printable strings are extracted from it first and then matched. Defaults to False.

Returns

The Chepy object.

Return type

Chepy
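
For readers who want to validate candidates after extraction, one possible approach is to let the standard ipaddress module reject malformed addresses. The helper find_ips and its candidate pattern are assumptions for this sketch and do not reflect how Chepy matches addresses.

```python
import ipaddress
import re


def find_ips(text: str) -> list:
    """Hypothetical helper: keep only tokens that parse as IPv4 or IPv6."""
    found = []
    # Candidate tokens made of hex digits, dots and colons.
    for token in re.findall(r"[0-9A-Fa-f:.]{2,}", text):
        candidate = token.strip(".")  # drop trailing sentence punctuation
        try:
            found.append(str(ipaddress.ip_address(candidate)))
        except ValueError:
            continue  # not a valid address, e.g. an octet above 255
    return found


ips = find_ips("gateway 192.168.1.1, dns 2001:db8::1 and bogus 999.1.1.1.")
```

A purely pattern-based extractor may report strings such as 999.1.1.1, so a validation pass like this can be useful when precision matters.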

extract_jwt_token() ExtractorsT

Extract JWT token

Returns

The Chepy object.

Return type

Chepy
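
A JWT consists of three base64url segments separated by dots, and the header segment normally starts with eyJ because it encodes a JSON object. The sketch below builds a sample token and finds it again with a regular expression. JWT_RE, b64url and decode_segment are illustrative names, and the pattern is an assumption rather than the one Chepy uses.

```python
import base64
import json
import re

# Three base64url segments; the header of a JSON object encodes to "eyJ...".
JWT_RE = re.compile(r"eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*")


def b64url(data: dict) -> str:
    """Encode a dict as compact JSON in unpadded base64url."""
    raw = json.dumps(data, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def decode_segment(segment: str) -> dict:
    """Restore the padding and decode one JSON segment."""
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


# Sample token with a placeholder signature (not cryptographically valid).
token = ".".join(
    [b64url({"alg": "HS256", "typ": "JWT"}), b64url({"sub": "42"}), "c2lnbmF0dXJl"]
)
text = f"Authorization: Bearer {token} (expires soon)"

found = JWT_RE.findall(text)
header = decode_segment(found[0].split(".")[0])
```

Extraction only locates the token; it does not verify the signature.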

extract_mac_address(is_binary: bool = False) ExtractorsT

Extract MAC addresses

Parameters

is_binary (bool, optional) – Set to True when the state is binary. Printable strings are extracted from it first and then matched. Defaults to False.

Returns

The Chepy object.

Return type

Chepy

extract_mailgun_api() ExtractorsT

Extract Mailgun API key

Returns

The Chepy object.

Return type

Chepy

extract_paypal_bt() ExtractorsT

Extract PayPal Braintree access token

Returns

The Chepy object.

Return type

Chepy

extract_rsa_private() ExtractorsT

Extract RSA private key

Returns

The Chepy object.

Return type

Chepy

extract_square_access() ExtractorsT

Extract Square access token

Returns

The Chepy object.

Return type

Chepy

extract_square_oauth() ExtractorsT

Extract Square OAuth secret token

Returns

The Chepy object.

Return type

Chepy

extract_strings(length: int = 4, join_by: Union[str, bytes] = '\n') ExtractorsT

Extract strings from state

Parameters
  • length (int, optional) – Min length of string. Defaults to 4.

  • join_by (str, optional) – String to join by. Defaults to newline.

Returns

The Chepy object.

Return type

Chepy

Examples

>>> Chepy("tests/files/hello").load_file().extract_strings().o
__PAGEZERO'
__TEXT'
__text'
__TEXT'
__stubs'
__TEXT'
...
extract_stripe_api() ExtractorsT

Extract Stripe standard or restricted API token

Returns

The Chepy object.

Return type

Chepy

extract_twilio_api() ExtractorsT

Extract Twilio API key

Returns

The Chepy object.

Return type

Chepy

extract_twilio_sid() ExtractorsT

Extract Twilio account or app SID

Returns

The Chepy object.

Return type

Chepy

extract_urls(is_binary: bool = False) ExtractorsT

Extract URLs, including the http, file, ssh and ftp schemes

Parameters

is_binary (bool, optional) – Set to True when the state is binary. Printable strings are extracted from it first and then matched. Defaults to False.

Returns

The Chepy object.

Return type

Chepy

extract_zero_width_chars_tags() ExtractorsT

Extract zero-width characters in the range U+E0000 to U+E007F. Implements https://www.irongeek.com/i.php?page=security/unicode-steganography-homoglyph-encoder

Returns

The Chepy object.

Return type

Chepy
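
The Unicode Tags block (U+E0000 to U+E007F) mirrors ASCII at a fixed offset, so a hidden message can be recovered by subtracting 0xE0000 from each tag code point. The following sketch shows that mapping; hide_in_tags and reveal_tags are hypothetical helpers, and whether Chepy decodes in exactly this way is an assumption.

```python
TAG_BASE = 0xE0000  # start of the Unicode Tags block
TAG_END = 0xE007F   # end of the Unicode Tags block


def hide_in_tags(cover: str, secret: str) -> str:
    """Hypothetical helper: append an ASCII secret as invisible tag characters."""
    return cover + "".join(chr(TAG_BASE + ord(c)) for c in secret)


def reveal_tags(text: str) -> str:
    """Hypothetical helper: map tag characters back to their ASCII counterparts."""
    return "".join(
        chr(ord(c) - TAG_BASE) for c in text if TAG_BASE <= ord(c) <= TAG_END
    )


stego = hide_in_tags("hello", "secret")
recovered = reveal_tags(stego)
```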

find_continuous_patterns(str2: Union[str, bytes], min_value: int = 10) ExtractorsT

Find continuous patterns between the state as a string and the provided str2

Parameters
  • str2 (Union[str, bytes]) – String to find matches against

  • min_value (int, optional) – Minimum value of continuous matches. Defaults to 10.

Returns

The Chepy object.

Return type

Chepy

find_longest_continious_pattern(str2: str) ExtractorsT

Find longest continuous pattern

Parameters

str2 (Union[str, bytes]) – String to find match against

Returns

The Chepy object.

Return type

Chepy
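
The idea of a longest continuous pattern (the longest substring shared by two strings) can be sketched with difflib from the standard library. The helper longest_common_run is an assumption for illustration and is not necessarily how Chepy computes its result.

```python
from difflib import SequenceMatcher


def longest_common_run(a: str, b: str) -> str:
    """Hypothetical helper: longest substring that appears in both a and b."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a:match.a + match.size]


state = "the quick brown fox jumps"
other = "a lazy quick brown dog"
shared = longest_common_run(state, other)
```

Here the shared run includes the surrounding spaces, since whitespace counts as part of the pattern.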

javascript_comments() ExtractorsT

Extract JavaScript comments

Some false positives are expected because of inline // comments.

Returns

The Chepy object.

Return type

Chepy
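
To see where the false positives come from, consider a naive pattern that treats every // as the start of a line comment. The expression below is an assumption for illustration and is not Chepy's pattern.

```python
import re

# Block comments, or anything from "//" to the end of the line.
COMMENT_RE = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)

source = '''/* setup */
var url = "http://example.com/api"; // endpoint
'''

comments = COMMENT_RE.findall(source)
```

The second match starts inside the URL string rather than at the real comment, because the pattern has no notion of string literals. Telling the two apart reliably would require tokenizing the JavaScript.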

xpath_selector(query: str, namespaces: str = None)

Extract data using valid XPath selectors

Parameters
  • query (str) – Required. XPath query

  • namespaces (str, optional) – Namespace. Applies for XML data. Defaults to None.

Returns

The Chepy object.

Return type

Chepy

Examples

>>> c = Chepy("http://example.com")
>>> c.http_request()
>>> c.xpath_selector("//title/text()")
>>> c.get_by_index(0)
>>> c.o
"Example Domain"