I think this is a pretty niche demand and probably another topic for r/DataHoarder but anyway, here I am.
I created this application to basically have a way to store my WhatsApp messages away from the Google/Meta servers. Or at least not depend so much on Google backup.
WhatsApp has a very limited export function, which any user can reach through the app’s own interface. Once the messages and media have been exported, you can place them in a folder monitored by ChatVault, send them to an email address monitored by ChatVault, or upload them via the interface. Once ingested, ChatVault stores the chat media on disk and saves the messages in a database in a structured way. These messages can then be browsed in a front end similar to a chat application.
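To make the "structured way" a bit more concrete, here is a minimal Python sketch of the idea, not ChatVault's actual code (the project itself is a Spring Boot application). The line format `DD/MM/YYYY HH:MM - Author: text`, the `Message` class and the `message` table are all assumptions made for illustration; real exports vary by locale and platform.

```python
import re
import sqlite3
from dataclasses import dataclass

# Assumed line shape: "DD/MM/YYYY HH:MM - Author: text" (illustrative only).
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}) - ([^:]+): (.*)$")

@dataclass
class Message:
    sent_at: str
    author: str
    text: str

def parse_export(lines):
    """Turn export lines into Message records.

    Lines that do not start a new message are treated as a
    continuation of the previous one (multi-line messages).
    """
    messages = []
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            messages.append(Message(*match.groups()))
        elif messages:
            messages[-1].text += "\n" + line
    return messages

def store(conn, chat_name, messages):
    """Save parsed messages into a hypothetical `message` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS message "
        "(chat TEXT, sent_at TEXT, author TEXT, text TEXT)"
    )
    conn.executemany(
        "INSERT INTO message VALUES (?, ?, ?, ?)",
        [(chat_name, m.sent_at, m.author, m.text) for m in messages],
    )
    conn.commit()

sample = [
    "15/11/2023 17:10 - Alice: Hello",
    "15/11/2023 17:11 - Bob: Hi there",
    "this line continues Bob's message",
]
conn = sqlite3.connect(":memory:")
msgs = parse_export(sample)
store(conn, "demo chat", msgs)
```

Keeping the parser separate from the storage step means a different export format only needs a different parser.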
It’s still under development and some things need to be improved (mainly the UI), so it’s still far from ideal. It’s true that the way WhatsApp allows us to export messages is quite bad, which leaves the whole process of exporting and ingesting into ChatVault tightly coupled to it, but it can still be useful for those who want to store their messages independently, just like I wanted.
https://github.com/vitormarcal/chatvault
Edit: add an application interface image
Awesome work!
Looks very promising! Thanks :)
I worked on a similar project for a long time… I don’t really have code that I can share right now, but perhaps I can offer some hints.
Instead of exporting chats from the app (since the function is too limited) I decrypted the backup database that WhatsApp creates. This was originally possible only on rooted Android phones, but it’s now possible if the user has chosen to encrypt the backup (and the encryption key is known).
Once the database is decrypted, it is just a matter of extracting messages and media information from it and rebuilding the chat. One of the problems here (and one of the reasons why I could not keep spending time on this project) is that Meta has changed the structure of the database multiple times over the last few years.
Here you can see an example of a chat I recreated from the database. The display format is HTML, trying to replicate the WhatsApp original look.
Let me know if you want more info. I will be happy to share!
EDIT: It seems I cannot post a picture in the comment. Is that a limitation of this community?
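Since the structure keeps changing, a reasonable first step after decryption is to look at what the file actually contains rather than hardcode table names. The sketch below assumes the decrypted backup is an ordinary SQLite file; the function name `describe_database` is made up for illustration, and no particular WhatsApp table or column is assumed.

```python
import sqlite3

def describe_database(path):
    """Return {table_name: [column names]} for every table in a SQLite file."""
    conn = sqlite3.connect(path)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        return {
            table: [col[1] for col in conn.execute(f'PRAGMA table_info("{table}")')]
            for table in tables
        }
    finally:
        conn.close()
```

Comparing this output between backups taken with different app versions is one way to spot a schema change before an import breaks on it.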
I didn’t know you could access these messages without root these days, good to know!
I was already aware of these frequent structure changes, so I went with the simplest approach as a proof of concept.
Anyway, I plan to study a little more to find out how the WhatsApp database works. I don’t rule out an experimental import of this type.
Thanks for the information! :D
This is great! I will definitely try this out when I get the chance and follow development.
For the last year or so I had been thinking of a similar thing and planning to write a post on r/datahoarder about it, but decided not to because as someone who knows nothing about software development I would be basically just asking people to create an app I want for free.
Here are some random thoughts I had had for that app:
- Import from multiple sources (WhatsApp/Signal/Telegram/etc.) - either from databases, backups, or exports, or just by parsing text files in case you are able to compose a chat history in some other way.
- Optional chat merge - merge chats from different sources into a single chat history instead of keeping them as separate chats.
- Address book/contacts - so that you can manage things clearly when merging.
If anyone is looking for a comparable alternative for Signal, I can recommend https://github.com/bepaald/signalbackup-tools. You can export a backup in Signal Android, then decrypt it and export HTML with that CLI. It even offers search via JavaScript.
How does the importing work? I have a whatsapp folder containing logs.zip, Databases, Backups, Media. I put the whole folder into the import folder of chatvault but execute disk import does not seem to do anything
I’m also looking forward to a new way to import since, like you, I manually back up the complete WA folder from my phone and thus have never lost any messages or been reliant on the Google Drive backups.
For now, you open the WhatsApp interface, go to export chat, and choose whether to send it by email or save it to disk. With these files you can configure import via email, import the zip file through the interface, or place them in a specific folder and trigger the disk scan through the ChatVault interface.
Couldn’t get it to work. Tried running it in docker. Getting this error
:: Spring Boot :: (v3.1.2)
2023-11-15T17:10:53.882Z INFO 1 --- [ main] dev.marcal.chatvault.Boot$Companion : Starting Boot.Companion v0.0.1-SNAPSHOT using Java 17.0.9 with PID 1 (/app/chatvault.jar started by root in /app)
2023-11-15T17:10:53.976Z INFO 1 --- [ main] dev.marcal.chatvault.Boot$Companion : No active profile set, falling back to 1 default profile: "default"
[Too many errors, abort]
I just cleared all images and containers to make sure I wasn’t working with any cache and that everything went fine.
This message is expected because we are not defining any profiles: No active profile set, falling back to 1 default profile: “default”.
Then I would have to see what error happened after that. The only required properties are the database connection properties.
That being said, you can run compose.yml in the project root. It will build an image locally. Or replace the
build: ./
line with
image: ghcr.io/vitormarcal/chatvault:latest
and then run:
docker-compose -f compose.yml up
I tested these two ways here and they continue to work. Have you added database information?
I tried with this setup only. This was my docker setup.
chatvault:
  image: ghcr.io/vitormarcal/chatvault:latest
  restart: unless-stopped
  environment:
    - SPRING_DATASOURCE_URL=jdbc:postgresql://postgres:5432/chatvault
    - SPRING_DATASOURCE_USERNAME=${POSTGRES_USER}
    - SPRING_DATASOURCE_PASSWORD=${POSTGRES_PASSWORD}
  ports:
    - 8106:8080
  volumes:
    - '~/chatvault:/opt/chatvault'
    - '~/chatvault/config:/config'
  depends_on:
    - postgres
I’m sorry about that, is your Operating System Unix or Windows? x86 or arm? I tested it on Ubuntu server and fedora and it is correct, I will test it on Windows soon. I’m trying to imagine what it could still be.
It was ARM
Hey, I’d love to use it (and donate) too. Thanks for building this
This is an interesting project, I’ll keep an eye on this for sure. Does anyone know any similar projects that are more general not only restricted to WhatsApp?
Awesome project!
Sorry for the noob question, but how do I export the chat? I mean, when I hit the export chat button it only gives me the option to send it via email (and then attaches a .txt and the .jpgs, not a zip file) or share it with Telegram. There is no option to save to disk.
Please don’t take this as criticism, this is a great idea and I fully plan on contributing to the codebase.
With that said I spent a few hours trying to get it to work. No luck. Docker, no. Docker compose. No.
I took the code and built / run manually. That worked but then I couldn’t import a chat. I tested with one line with no attachments. From just that one line, here are the problems so far:
- it doesn’t seem like WhatsApp has a standard way of exporting the text file. Your text file and my text file are different. In the US the format is [datetime] name msg. In your file it’s different and so it breaks the moment it hits the [.
- unfortunately not accounting for locale. US stupidly uses mm/dd/yy. You have hardcoded the formatter for dd/mm/yy. Maybe you need to have a locale selection in the UI before import. Without that, no US messages are coming in.
- it doesn’t account for 6/6/23. It’s expecting 06/06/2023. Again formatter and padding can fix that.
- ui creates an entry in chats table for every attempt regardless of if a message was imported.
- exceptions in other languages.
- missing tests for the stuff above
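As a rough illustration of the bracket, locale and padding points, here is a small Python sketch of a tolerant timestamp parser. It is not ChatVault's code; `STAMP_RE`, `parse_stamp` and the `day_first` switch are hypothetical names, and treating two-digit years as 20xx is an assumption.

```python
import re
from datetime import datetime

# Accepts optional brackets, unpadded fields, 2- or 4-digit years,
# optional seconds and an optional AM/PM marker.
STAMP_RE = re.compile(
    r"^\[?(\d{1,2})/(\d{1,2})/(\d{4}|\d{2}),? "
    r"(\d{1,2}):(\d{2})(?::(\d{2}))?(?:\s?([AaPp][Mm]))?\]?"
)

def parse_stamp(line, day_first):
    """Parse the timestamp at the start of an export line, or return None."""
    match = STAMP_RE.match(line)
    if match is None:
        return None
    a, b, year, hour, minute, second, meridiem = match.groups()
    # The caller decides the field order; the text alone cannot always tell.
    day, month = (a, b) if day_first else (b, a)
    year = int(year)
    if year < 100:
        year += 2000  # assumption: two-digit years belong to the 2000s
    hour = int(hour)
    if meridiem:
        hour = hour % 12 + (12 if meridiem.lower() == "pm" else 0)
    return datetime(year, int(month), int(day), hour, int(minute), int(second or 0))
```

With this, `parse_stamp("6/6/23, 9:05 PM - John: hi", day_first=False)` and `parse_stamp("[06/06/2023, 21:05:33] John: hi", day_first=False)` both parse. A date such as 15/11/2023 read with `day_first=False` raises `ValueError` from `datetime`, which is one signal that the wrong order was chosen.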
Again none of that is supposed to be criticism. This is a great idea and I fully intend to help out with it.
Good job!
Hello, criticism is certainly welcome!
If you can open an issue on github it will be easier for me to follow, as I may not see these comments.
About the message date, you are right! Until I published the project, I was the only user, so I didn’t know there could be multiple message date formats. I developed for one specific format (which isn’t even uncommon).
Someone had already warned me about this, so I started to develop something that could parse the message date the right way. It’s almost ready, but unfortunately there are dates that end up being ambiguous, like 01/01/2023, where it’s not possible to infer the correct format. So I’ll probably have to create an environment variable for that, which I really didn’t want.
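One way to lean on that setting less often, sketched here in Python as an idea rather than as what ChatVault does: scan every date in the export, and if any first or second field is greater than 12, the order is settled for the whole file. Only when nothing decides it would the configured value be needed. The name `infer_day_first` is hypothetical.

```python
import re

DATE_RE = re.compile(r"^\[?(\d{1,2})/(\d{1,2})/\d{2,4}")

def infer_day_first(lines):
    """Return True (day first), False (month first) or None (cannot tell)."""
    first_over_12 = second_over_12 = False
    for line in lines:
        match = DATE_RE.match(line)
        if not match:
            continue
        first, second = int(match.group(1)), int(match.group(2))
        # A field above 12 cannot be a month.
        first_over_12 = first_over_12 or first > 12
        second_over_12 = second_over_12 or second > 12
    if first_over_12 and not second_over_12:
        return True
    if second_over_12 and not first_over_12:
        return False
    return None  # ambiguous or contradictory: fall back to configuration
```

A chat whose messages all fall on days 1 to 12 still returns `None`, so the fallback setting remains necessary.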
If you look on GitHub, I opened a bug issue for this (although maybe it’s not really a bug but rather an improvement, because it works, just with a specific format), and in the GitHub Projects section it’s already marked as under development.
Regarding the duplication of data, I mentioned it in a comment here on this post, but perhaps I should have made it clearer. Anyway, as I said in the post, the project is still far from ideal, despite it working very well for my use case (every import I do is always new messages).
Anyway, I know this is a very important point, so I created a way of deduplication considering the last message in the database as a cutoff parameter. This is already in the latest version of the docker image.
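The cutoff idea can be pictured with the short Python sketch below. It only illustrates the principle described above and is not the project's implementation; the function name and the `sent_at` key are assumptions.

```python
from datetime import datetime

def new_messages_only(incoming, last_stored_at):
    """Keep only messages strictly newer than the newest one already stored."""
    if last_stored_at is None:  # empty database: import everything
        return list(incoming)
    return [m for m in incoming if m["sent_at"] > last_stored_at]

incoming = [
    {"sent_at": datetime(2023, 11, 14, 9, 0), "text": "already imported"},
    {"sent_at": datetime(2023, 11, 15, 17, 10), "text": "new"},
]
fresh = new_messages_only(incoming, datetime(2023, 11, 14, 9, 0))
```

One caveat of a strict cutoff like this: a message that shares the exact timestamp of the last stored one would be skipped on re-import.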
Regarding docker, more information about the error is welcome, you weren’t the first person to talk about it but I couldn’t replicate the problem. I tested it on Fedora and the Ubuntu server, I built it locally, I pulled it from the Registry, did docker system prune --volumes -a, and it still worked as expected.
This is definitely what i was looking for the whole time. Thank you for this project.
I have a question though. After setting it up on Docker, I tried to import a chat, but nothing happens. I only get a “file was imported” message and a bit of loading, but it’s still empty.
Any advice?
Good find on my side, I’ve been looking for some way of archiving whatsapp messages, I’ll take a look
THANK YOU!!!
I have 15+ GB of WhatsApp stuff and it is unmanageable. My best friend has 25 GB if I remember correctly. These are chats and media from about 10 years ago until today.
ChatVault is the archiving solution I’ve been dreaming of for YEARS, so I (we) can finally offload most of the old chats from our mobile devices while still keeping them accessible via a front end.
I will follow this closely. Please also consider accepting donations - I will definitely donate if ChatVault fits my use, this is a game changer!
I have a WhatsApp backup zip file. This won’t work, right? Since I think it’s encrypted.
For now, the application cannot deal with WhatsApp’s own database files. It can only import the txt of messages and the media that WhatsApp generates when a chat is exported through the app’s interface.