Password cracking: Difference between revisions

From Rixort Wiki
Revision as of 17:02, 24 July 2018

Initial steps

Steps required for password cracking software:

  1. Identify which columns contain the username and the password (hashed or otherwise). May be easier to convert to a standard internal representation before processing.
  2. Identify the algorithm used.
  3. Identify whether a salt is used.

From these there are multiple stages:

  1. If no salt is used (e.g. plain MD5), consult a pre-computed lookup table.
  2. If a small salt is used (say fewer than 4 bits), consult a pre-computed lookup table built for each possible salt value.
  3. If a sensible algorithm is used (e.g. bcrypt with large salt), check dictionary, then common words, then variants of the previous two, then brute force.
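The stages above could be wired together roughly as follows. This is a minimal sketch, not a definitive design: the function name `crack`, the in-memory `lookup_table` dict and the `verify` callback are all hypothetical names introduced for illustration, and MD5 is used only because it is the unsalted example given above.

```python
import hashlib
from typing import Callable, Dict, Iterable, Optional


def md5_hex(candidate: str) -> str:
    """Hex-encoded MD5 digest of a candidate password."""
    return hashlib.md5(candidate.encode("utf-8")).hexdigest()


def crack(
    stored: str,
    lookup_table: Dict[str, str],
    candidate_stages: Iterable[Iterable[str]],
    verify: Callable[[str, str], bool],
) -> Optional[str]:
    """Return the plaintext for `stored`, or None if every stage fails.

    lookup_table maps lowercase hex digests to plaintexts (hypothetical layout).
    candidate_stages is ordered cheapest first: dictionary, common words,
    variants, brute force.
    verify(candidate, stored) hides the algorithm and salt handling.
    """
    # Stage 1: an unsalted hash may already be in the pre-computed table.
    hit = lookup_table.get(stored.lower())
    if hit is not None:
        return hit
    # Later stages: hash each candidate and compare, one source at a time.
    for stage in candidate_stages:
        for candidate in stage:
            if verify(candidate, stored):
                return candidate
    return None
```

Keeping the algorithm behind a `verify` callback means the same loop works for a salted MD5 check or for a slow algorithm such as bcrypt, where only the per-candidate cost changes.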

Identifying an algorithm

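  • Length: 32 hexadecimal characters (16 bytes) is likely to be MD5.
  • Characters: only 0-9a-fA-F suggests a hex-encoded digest (MD5, SHA-1, SHA-2 etc.); combined with a length of 32 it is likely to be MD5.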
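A minimal sketch of these two checks, assuming the stored value is available as a string. The function name `guess_algorithms` and the length table are illustrative assumptions rather than an exhaustive classifier. Other algorithms can produce digests of the same length, so the result is only a list of candidates.

```python
import re
from typing import List

# Hex digest lengths (in characters) for a few common unsalted hashes.
# Illustrative only: other algorithms share these lengths.
HEX_LENGTH_TO_ALGORITHMS = {
    32: ["md5"],
    40: ["sha1"],
    64: ["sha256"],
    128: ["sha512"],
}

HEX_PATTERN = re.compile(r"[0-9a-fA-F]+")

# bcrypt hashes are self-describing and start with one of these prefixes.
BCRYPT_PREFIXES = ("$2a$", "$2b$", "$2y$")


def guess_algorithms(value: str) -> List[str]:
    """Return candidate algorithm names for a stored password value."""
    if value.startswith(BCRYPT_PREFIXES):
        return ["bcrypt"]
    # Character check first, then length check.
    if HEX_PATTERN.fullmatch(value):
        return HEX_LENGTH_TO_ALGORITHMS.get(len(value), [])
    return []
```

An empty list means neither check matched, which may indicate plain text, a different encoding such as Base64, or an algorithm not in the table.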

Lookup tables

  • How should these be delivered? Plain text file, SQLite database, Lightning Memory-Mapped Database (LMDB), something else?
  • What options does the chosen language support?
  • Which options are the most efficient?
  • Can lookup tables be built entirely in memory and then flushed to disk? Committing after every insert, as SQLite does by default, prevents data loss but takes longer because of the repeated I/O. Answer: yes, wrap the whole build in a single large transaction and commit at the end.
  • Trade-off between size of table (and time to generate) and coverage. May not be worthwhile building lookup tables for anything more than dictionary words and common passwords.
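If SQLite were chosen, the single-transaction approach might look like the sketch below, using Python's standard `sqlite3` module. The table name `lookup`, its two columns and the function names are assumptions made for illustration, and MD5 is used only as the example algorithm.

```python
import hashlib
import sqlite3
from typing import Iterable, Optional


def build_lookup_table(conn: sqlite3.Connection, words: Iterable[str]) -> None:
    """Insert digest -> plaintext rows with a single commit at the end."""
    # Using the connection as a context manager commits once on success
    # and rolls back if an exception is raised part way through.
    with conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS lookup "
            "(digest TEXT PRIMARY KEY, plaintext TEXT NOT NULL)"
        )
        conn.executemany(
            "INSERT OR IGNORE INTO lookup (digest, plaintext) VALUES (?, ?)",
            ((hashlib.md5(w.encode("utf-8")).hexdigest(), w) for w in words),
        )


def lookup(conn: sqlite3.Connection, digest: str) -> Optional[str]:
    """Return the plaintext for a hex digest, or None if it is not stored."""
    row = conn.execute(
        "SELECT plaintext FROM lookup WHERE digest = ?", (digest.lower(),)
    ).fetchone()
    return row[0] if row else None
```

The primary key on `digest` gives an index for lookups, at the cost of slower inserts while the table is being built.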

Possible contents of lookup tables:

  • Dictionary words
  • Common words not in dictionary (e.g. TV shows)
  • Simple combinations, such as dictionary word concatenated with '1', '123' etc.
  • Every possible combination of upper and lower case letters and digits (0-9, a-z, A-Z) from 6 to 12 characters in length.
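The last two kinds of content can be generated lazily, which also makes it easy to estimate the table size before building anything. The sketch below is illustrative: the function names and the `COMMON_SUFFIXES` tuple are assumptions, with the suffixes taken from the examples in the list above.

```python
import itertools
from typing import Iterable, Iterator

COMMON_SUFFIXES = ("1", "123")  # examples from the list above; extend as needed


def simple_combinations(
    words: Iterable[str], suffixes: Iterable[str] = COMMON_SUFFIXES
) -> Iterator[str]:
    """Yield each word followed by the word with each suffix appended."""
    suffixes = tuple(suffixes)
    for word in words:
        yield word
        for suffix in suffixes:
            yield word + suffix


def brute_force(alphabet: str, min_len: int, max_len: int) -> Iterator[str]:
    """Yield every string over `alphabet` with length min_len..max_len."""
    for length in range(min_len, max_len + 1):
        for chars in itertools.product(alphabet, repeat=length):
            yield "".join(chars)


def keyspace_size(alphabet_size: int, min_len: int, max_len: int) -> int:
    """Number of candidates brute_force would produce."""
    return sum(alphabet_size ** n for n in range(min_len, max_len + 1))
```

For a 62-character alphabet and lengths 6 to 12, `keyspace_size(62, 6, 12)` is on the order of 10^21 entries, which illustrates the size versus coverage trade-off noted earlier.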

Performance

Vectorisation vs concurrency vs parallelisation.

Languages

Language choice is a combination of performance, available libraries and existing knowledge. Obvious initial candidates are:

  • C
  • CPython (reference implementation of Python)
  • PyPy (Python implementation written in RPython, with a JIT compiler; often faster than CPython for pure Python code but sometimes behind in terms of version support)

Libraries

Many crypto libraries, including Python's hashlib and the cryptography package, are ultimately wrappers around OpenSSL.

Python