Brute Force BIP39 Passphrase Recovery. (25th Word, Hidden Wallet) Trezor, Keepkey, Ledger


so today we’re just going to do a short
follow-up on a video that I’d previously done on passphrase recovery so this is
the process you can use to recover your BIP39 passphrase if you’ve
forgotten it or maybe it’s called your twenty-fifth word or the password for
your hidden wallet depending on the hardware wallet you’re using and all of
the notes for what you need to start this time are the same as my last video
on it so you will need to have a correct twenty four word seed you will
need to know one address that was used with your wallet so let’s say for all of
these examples you know we’d purchased some Ethereum on Coinbase we looked at
our emails and we logged into Coinbase to see where we had sent it to like all
the other ones you need to download BTCRecover and create an air-gapped environment
if you want to do this securely and I’ve got some videos that show you how to do that
so once you’ve set up your environment there are two general approaches we’re
going to look at and the first one is basically a 100% brute force based
approach so that’s going to be where we have a token file that’s full of just
individual letters and we’re just putting them together in different
combinations to brute force a password and that works okay for shorter
passwords which some people might have used particularly if you’re just
interested in like a plausible deniability
style passphrase setup the second thing we’re going to look at is different types of
dictionary attacks and these can work effectively for long complex
passphrases we can use a few different types of dictionaries with different types of
words in them and we can also start to try different combinations of words in a
sort of brute force style so we’re gonna look at two different ways that we can
use that so firstly via a password list file so that’s where it just runs
through the dictionary basically from start to finish and you know it could be
a dictionary list it could be a list of frequently used English words it could
be a password dump that’s been leaked or compiled from previous website hacks so
there’s like a RockYou list and a phpBB list that we’ll be referencing later
which can be really useful because you know they have a lot of passwords that
people commonly pick and you know it could be something like using the Diceware lists if you’d used Diceware or even just making a list that’s full
of you know names of family members passwords you often use all those
sorts of things and you can download some of them from here and that website
will be in the description so the other thing we’re going to look at really
quickly is the way that you can use dictionaries sort of in a brute-force
kind of way so you can look for multiple words that have been connected together
and this does hit practical limits once the dictionaries start getting quite large
you know for example I’ve said here you know you can recover something that’s
three Diceware words off the short word list that they provide but you know any more
than that things start getting time-consuming and the important thing
though is if you look at my previous video there are rules that you can give
around these token files that will speed things up dramatically so you can help
give BTCRecover some guidance around the kinds of characters that might only
appear between words might only appear at the start might be at the finish and
it comes down to I guess you thinking about the kinds of passwords and the
kinds of phrases that you would have strung together the other thing that’s
really important to consider when you’re doing brute-force attacks like this is
the kind of hardware you’ve got at your disposal and for these tests I was
using a mix of dual core and quad core i5 processors and also spun up a 48 vCPU
Linode and the reason I added that Linode in there is that you know you can
buy as a consumer now these Ryzen and Threadripper CPUs and when I
ran the test and I’ll show you in a sec you know this Ryzen processor would be
about ten times faster than the i5 that I was using in these tests and they can
make a huge difference if you have something that might say it’s going to
take you a year to brute-force on like an i5 you know you could be looking at
a month with some of these more high-end processors and that’s before we even
consider GPUs or FPGAs so you can see here that the performance is pretty
consistent so you know this is running the same test on the laptop so it had an
ETA of two days the desktop an ETA of one day and as you can see it was fully
utilizing all the cores and then on the Linode we can see it was going to take
a whole grand total of three hours with all 48 vCPUs just running flat out so
that’s an important thing to consider when you’re just looking at these
numbers and trying to work out what is doable in your situation
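
A rough way to sanity check what is doable is to divide the number of candidate passphrases by the guess rate you actually measure on your own machine. The sketch below does only that arithmetic. The candidate count and the guesses-per-second figures are made-up placeholders, chosen so the output lines up with the two days, one day and three hours mentioned above; they are not benchmarks of any real CPU.

```python
def eta_seconds(candidates: int, guesses_per_second: float) -> float:
    """Time needed to try every candidate at a steady guess rate."""
    if guesses_per_second <= 0:
        raise ValueError("guesses_per_second must be positive")
    return candidates / guesses_per_second


def format_duration(seconds: float) -> str:
    """Render a duration using the largest unit that fits."""
    for unit, size in (("days", 86_400), ("hours", 3_600), ("minutes", 60)):
        if seconds >= size:
            return f"{seconds / size:.1f} {unit}"
    return f"{seconds:.1f} seconds"


# Placeholder numbers for illustration only (assumptions, not measurements).
CANDIDATES = 86_400_000
HYPOTHETICAL_RATES = {
    "laptop i5": 500.0,
    "desktop i5": 1_000.0,
    "48 vCPU server": 8_000.0,
}

for machine, rate in HYPOTHETICAL_RATES.items():
    print(f"{machine}: {format_duration(eta_seconds(CANDIDATES, rate))}")
```

Swapping in your own measured rate and candidate count gives a quick estimate before you commit to a long run.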
so we’re going to look at some examples and for all of these examples we use
this same 24 word seed phrase here and we’re trying to recover some different
addresses so the first one is looking at a short password that we’re just going
to brute-force so this is a short four character password I did do tests of you
know one two and three character passwords but they were brute forced so
quickly that it wasn’t even worth showing so this is one where we’re
looking for the word coin and these are the commands we’re using with BTCRecover here so I’ve not asked BTCRecover to check for any typos it’s
just going to run through all the different combinations of letters and
not worry about uppercase or lowercase and we can see that it worked it out in
a minute and the full run would have taken an hour and that’s the command
there all of these commands will also be in the description down the very bottom
as well if you want to copy and paste them and try them yourself a second
similar test is saying well maybe we weren’t sure about some capitalization
so we would have run the same sort of test again with the same token
list and we’ve added in here --typos-case and --typos 1 so it’s going to do
all the tests before but also going to run through and capitalize the different
words and we can see that that test took five minutes to run now the thing that’s
really important to understand when we’re using BTCRecover this way is we
need to tell BTCRecover how many letters we want to try stringing
together and the way we do that is by using this --max-tokens argument here so
basically we’ve said --max-tokens 4 and because we’re using that with a token
file that’s just individual letters that means it’s going to try a maximum of
four letters together likewise here again that’s still only four tokens that
we’re using together whereas when we get to some of these ones where we’re
chaining together sets of tokens out of dictionaries we need to tell it how many
individual words we want to string together so that only matters when we’re
using a token list not a password list and we can see here that was the five
letter recovery so we had five tokens and likewise with these later tests
where we’re stringing say two and three words together we’re setting the max
tokens to be two for the two word one and three
for the other you really need to make sure you’re setting this --max-tokens
argument otherwise you’re just gonna end up with like stupidly long and complex
tests almost straight away the other thing you’ll notice is again
this one down the bottom here for three words strung together you know it would
have taken a day on the desktop CPU and three hours on like a Ryzen Threadripper or a Linode 48 vCPU system so the token list that I used for
examples 1 and 2 is really really straightforward it’s just basically a
text file so tokens-BF-double.txt with just one character per line and it’s just a to z
0 to 9 and I’ve just gone through the alphabet twice and again you just need
to work out whether you’re using dictionary words or using completely random words
and letters and whether you’re including maybe special symbols in there or not
because again the more of these you add in the quicker you’re going to get into
the realm of passphrases that are difficult to brute-force but again if
you’re thinking you were likely just to use you know normal alphanumeric sort of
stuff then you know a list like this can be perfectly acceptable for that so the
other thing to consider is if you’re wondering which letters and which
numbers and which symbols to include on here it’s worth having a look at your
hardware wallet and seeing what it supports so for example the Trezor
allows you to just enter in with your keyboard on your computer the passphrase
every time and you can use just about whatever symbols you like whereas with a
Ledger wallet because you’re entering it all on the device all the time they
actually only offer a much smaller set of symbols than is actually possible
within BIP39 passphrases so yeah just have a look and see what are the valid
passphrases you could have even entered in the first place because it might
be a smaller character set than you think
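
The single-character token file described above is easy to generate rather than type out by hand. This is a minimal sketch, assuming the layout from the video (one character per line, a to z then 0 to 9, with the whole set written twice). The function names are made up for this example, and the reason for the doubling is my assumption: presumably it lets the same character appear twice in one guess, if each token line is only used once per candidate.

```python
import string

# Character set from the video: lowercase letters and digits.
# Extend this only with symbols your hardware wallet can actually enter.
ALPHABET = string.ascii_lowercase + string.digits


def build_token_lines(alphabet: str, repeats: int = 2) -> list[str]:
    """Return one token per line, with the whole alphabet repeated."""
    return [ch for _ in range(repeats) for ch in alphabet]


def write_token_file(path: str, lines: list[str]) -> None:
    """Write the tokens to a plain text file, one per line."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")


if __name__ == "__main__":
    tokens = build_token_lines(ALPHABET)
    write_token_file("tokens-BF-double.txt", tokens)
    print(f"wrote {len(tokens)} token lines")
```

If your wallet only accepts a smaller character set, trimming `ALPHABET` to match keeps the search space from growing for no reason.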
so we’ll just go into example 3 and example 4 and essentially what we’re
looking at here is sort of where you get to that boundary between where a
brute-force approach will work versus a dictionary approach and we can see here
for this next test we were looking for the word Smith you know super common
last name there’s no reason why someone might not have just picked that as their
passphrase and we’re still using that same tokens-BF-double.txt that I
talked about earlier we’re also saying well maybe there’ll be
some capitalization in there because it’s a last name so we run that
test and this is where we have a few different things so firstly I ran it via
brute force with that same token list we saw before it took you know four or five
hours but it could have taken five days bearing in mind this is on an i5 desktop
so what you could achieve with a you know multi-core monster is a lot more it
would have taken a lot less time about a tenth of the time we can also see that
with a dictionary attack so this is where we’re using English.txt so this is just an
English dictionary that I got that took about 32 minutes to run and the other
dictionary that I used for this one is this password
list called RockYou which comes with various security-focused distributions
of Linux and you can see it’s basically a whole list of passwords that people
have been using on websites and things that were leaked or hacked and it
includes some fairly long and complex passwords that you know you’d be really
unlikely to get using you know brute force based approaches but you know this
has got like a huge huge number of passwords in there and you know if you
just happen as lots of humans do to have picked something that you thought was
really random that just happens to be the same thing that a whole bunch of
other people just picked because they thought it was random you might very
well find that your super duper secure password is actually on this list and
it’s really worth just checking it as well because who knows you may have just
thought the same as many many other hundreds of people before you in picking
a password so that’s RockYou so we can see that RockYou found Smith as well
capitalized in a few seconds and so what you can see we’ve done is also for this
password list you can use that in conjunction with having typos that
BTCRecover will check so what we’ve done here is we’ve said for each row in the
password list file it’s going to assume that two of those letters might have
been an uppercase letter and doesn’t just have to be the first letter
or the last letter it can be any two and it’s just going to run through and test all of those so we can
see an example here of that dual capitalization happening with a
password that’s a bit longer and it’s just YouTube for the sake of it and we can
see that we ran it with this command against the RockYou list and that’s
what it came up with so it found it in just a couple of minutes but the full
length run could have taken a couple of days so I thought we’d do another test
as well which was just the correct horse battery staple running that against the
tokens English list just to demonstrate that but that ran out of
memory and didn’t work so it’s important to realize that once you start using a
big long dictionary as a token list you very quickly start hitting both memory
and computation limits so a three megabyte token list file is just too long for
that sort of thing and we’ll see the practical limits of that in a minute the
other two examples we have are examples where we’ve used a dictionary list so in
this one we’ve used the token list zero to a thousand and had a password that
might have strung two words together that are on that list so correct
question and we can see here that that ran and it found the result in a couple
of minutes and basically on this one though it didn’t take nearly as long
because we weren’t bothering to check for capitalization or anything like that
we’re just trying to illustrate connecting multiple tokens together so the
last example that I’m going to show is basically looking at three words
together and we’re using the Google top 500 English words for this one just to
make the dictionary a bit shorter just to illustrate the point and we can see
that that ran on the laptop it took about two days to run so that would have
taken a day on the desktop and three hours on the Ryzen CPU and I’m gonna
do another video on this but I think it really illustrates the importance of
using a long dictionary if you’re going to be just chaining words together if
you’re using a short dictionary file like even the Diceware short list you
really need to select a decent number of words out of there before an attack on
that phrase becomes impractical that’s probably a really good place to talk
about the computational limits for some of this stuff and it’s important to
understand that possible passwords when using a password list like RockYou or
something like that they scale in a linear
fashion if you’re using a password list that’s twice as long it’ll take twice as
long to run if you’re using a password list that’s ten times as long it’ll take
ten times as long to run so time is basically the list length divided by the
speed of your system once you start using a token list the possible time for it to
process scales much faster than that so what that means is that if you’re chaining two tokens together and you double the
size of the list the processing could potentially take four times as long and if
you increase the size of the token list by ten the test will take 100 times
longer to run the other thing to understand is that the number of combinations increases
exponentially with the number of tokens you are looking for so for example a
four word password with a hundred word dictionary will have 100 to the power of
four options a five word password with a hundred word dictionary will be 100 to
the power of five so what we can see is that by adding one more word so increasing
the length of the phrase by 25% it will take 100 times as long to be processed
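
The scaling just described can be checked with a few lines of arithmetic. This sketch uses the same simplification as the video, where every position in the phrase can be any word from the list, so the count is the list size to the power of the number of words. The function names are made up for the example, and a real tool may try slightly fewer combinations, for instance if it never repeats a token within one guess.

```python
def candidates_exact_length(list_size: int, tokens: int) -> int:
    """Phrases made of exactly `tokens` words from a list of `list_size`."""
    return list_size ** tokens


def candidates_up_to(list_size: int, max_tokens: int) -> int:
    """Phrases made of 1 up to `max_tokens` words (what a max-tokens cap allows)."""
    return sum(list_size ** k for k in range(1, max_tokens + 1))


# Password list: one guess per line, so work grows linearly with length.
assert 2 * 1_000_000 == 2_000_000

# Token list, two tokens chained: doubling the list quadruples the work.
print(candidates_exact_length(200, 2) // candidates_exact_length(100, 2))

# One extra word from a 100 word list multiplies the work by 100.
print(candidates_exact_length(100, 5) // candidates_exact_length(100, 4))

# Up to three words from a 500 word list, as in the last example.
print(candidates_up_to(500, 3))
```

Dividing a count like this by your measured guesses per second gives the kind of ETA figures quoted for the laptop, desktop and Linode runs.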
I’m gonna look more into that in my next video that looks at selecting a good BIP39 passphrase so my suggested workflow for this would
be number one to set up the recovery environment to make sure that you know
how to make it work and to try running through some of the examples that are
provided here and the reason why I say that that’s a good thing to do is just
to make sure you don’t waste a whole month running a test that’s actually
never going to find the result you want simply because you’ve got a typo
somewhere or have misunderstood some of the arguments for BTCRecover I’m also
going to include the password lists that I used for this on GitHub so you can
actually reproduce all of the tests that I did here just to make sure you’ve got
a firm grasp on how to use these tools before you start using them on your own
wallets and phrases my suggestion is again if you’ve got no
idea what your passphrase was start with trying to brute-force something short
because you know you might have just picked a password that was easy to
remember that was mostly about plausible deniability and you can create a token
list with letters in it or just download the one I’ve put on GitHub my
next suggestion would be to try a couple of different dictionaries just for
dictionary attacks like the RockYou list just as a password list and also if that
doesn’t give you a result create yourself a token list with every
password every family member name every nickname friends pets all of those
things and stuff that you might use in
passwords and start doing a multiple token sort of attack on that so if
you’re someone who might have strung together two or three of those kinds of
things to make a password that can be a really common thing and you only have to look
through the password lists in things like RockYou to see that a lot of
people do that the other thing I’d say is be really selective about the kinds
of typos that you want to try and have BTCRecover check in my previous video
we sort of assumed that you had a pretty good idea about what your
passphrase was and you were really just trying to recover from a typo or a
slight misremembering or something like that whereas if you’re trying to do
brute force and dictionary based stuff having BTCRecover run through and like
check for missing letters check for letters that have been pressed twice check
for caps lock and all that sort of stuff can be extremely time-consuming so just
be really selective about the kinds of typos you want BTCRecover to check the
other note that I think I would add is concerning cloud servers you really need
to understand the risks so if you’re presented with a test that you are
pretty sure will recover your passphrase but it’s going to take months you know
paying for a cloud-based server or finding someone or buying a high-end CPU
to run these tests can be cost effective I just use Linode and I’ve put a link
to them in the description it’s just important that you understand that the
process of using BTCRecover exposes both your 24 word seed and
potentially your passphrase to the system you’re running it on so if you do use
a cloud-based server to do these kinds of recoveries one of your first
priorities should be to move all of your funds onto a brand new wallet so brand
new 24 word seed brand new passphrase because you should assume that the ones
that were exposed to this hired server have been compromised or at least could
very easily be in the future I think it’s important to acknowledge from the
outset that trying to do a brute force recovery with your passphrase should be
considered a fairly low odds play and it’s not something that has any
guarantees of success at all but it is also a good thing to do just to help
satisfy yourself that you’ve done everything you can to try and recover
your funds thanks for watching I hope that was
helpful just hit subscribe if you’d like to be kept in the loop about future
content I make to help people stay safe in the crypto space and to recover if
they get into trouble or if there’s a question you’d
like some more information about or topic you’d like me to cover in the
future just leave a reply

2 thoughts on “Brute Force BIP39 Passphrase Recovery. (25th Word, Hidden Wallet) Trezor, Keepkey, Ledger”

  1. Hi. Thanks. Your video is very informative, however it might be helpful to define the difference between a password and a passphrase. I.e. in your opening words displayed on the screen you mention short passphrases: 5 very doable, 6 chars is the limit. Isn't that a password? Or does 6 "chars" mean six words, as suggested by the word passphrase in the description? Thanks. kr
