Commit Diff


commit - e1044d63e638a01d3932fae080c42777125866ad
commit + e3ac3416ae8b8486d01a96d002ddfaa4915f0630
blob - eab0488892e0368ebc877d822b649d837d5aa203
blob + a61fba6189cfe00cd7a7920379123fac7ca328bc
--- README.md
+++ README.md
@@ -6,31 +6,31 @@ Let me first remind you what such application can do. 
 
 As explained very well on the blog of [Solene](https://dataswamp.org/~solene/2017-03-17-integrity.html), a bitrot application detects the "rotation" (flip) of a bit on your storage device. This could be a hard disk, an SSD, a USB stick or a CD. 
 
-Indeed with the time some bis are loosing their values and become "rotated". This could damage your files if you do not take cares of it. 
+Indeed, over time some bits lose their values and become "rotated". This could damage your files if you do not take care of it. 
 
 So, in short, a bitrot application checks that the checksums of unmodified files have not changed. If a checksum differs, the file is in bad shape and a restore from backup is more than welcome.
 
 
 # Why a new bitrot application ?
 
-I'm using a NAS server (running on OpenBSD) to store all important information of several users. This is a disk of 1TB which is backuped regularly on a 2nd external disk. Indeed I do not want to have both PROD and backup locate on the same place, they could be damaged both by fire, water, ... 
+I'm using a NAS server (running OpenBSD) to store all the important information of several users (myself included). It has a 1TB disk which is backed up regularly to a 2nd external disk. I do not want to have both PROD and backup located in the same place: they could both be damaged by fire, water, ... and I would lose all my data. 
 
 This NAS makes heavy use of the "time machine" concept and creates a lot of hardlinks between the "versions" of the local backups ([details here](/post/post_20160724)) 
 
-I'm not using a smart application to backup my data from PROD to Backup. I'm just doing disk copy with the command "pax -rw". So bad bits from my PROD machine will be copied to the backup, with very bad consequences. So, I need a bitrot application that could guarantee that all my files are in good shape before copying them on the backup disk. 
+I'm not using a smart application to back up my data from PROD to Backup. I'm just doing a disk copy with the command "pax -rw". So, if there are any, bad bits from my PROD machine will be copied to the backup, with very bad consequences. I therefore need a bitrot scanner that can guarantee that all my files are in good shape before copying them to the backup disk. 
 
-Why write an other bitrot application ?
+# Why write another bitrot application ?
 
 Indeed, we can find several bitrot apps on the internet. For example, you can find [bitrot in python](https://github.com/ambv/bitrot), [bitrot in javascript](https://github.com/laktak/chkbit), [bitrot in go](https://github.com/marcopaganini/bitrot), [bitrot scanner in go](https://github.com/kormoc/bitrot-scanner).
 
 In total I've found more than 10 bitrot applications. Unfortunately, none of them meets the requirements that matter to me:
 
-- use a very light checksum algorithm. Lot are using sha1 or md5, but here our need is just to detect integrity of a file. It's quite improbable that the missing bit we could have do not impact the checksum. I do not want to spend more than 3 hours on my small NAS server for such task. 
-- be smart and take into account hardlinks. Indeed, all of the bitrot app I've tested are based on filename. So, if you have 10 different files pointing to the same Inode, such tools will scan 10x this file and perform 10x the checksum. No really efficient for my case. 
+- use a very light checksum algorithm. Many use sha1 or md5, but here the need is just to detect a loss of integrity of a file, not tampering. It's quite improbable that a rotted bit leaves the checksum unchanged (which is what an attacker would try to achieve). I do not want to spend more than 3 hours on my small NAS server for such a scan. 
+- be smart and take hardlinks into account. Indeed, all the bitrot apps I've tested are based on filenames. So, if you have 10 different file names pointing to the same Inode, such tools will read this file 10x and compute the checksum 10x. Not really efficient for my case. 
 
-So, I was forced to write one system taking into account the hardkinks and after evaluation using [adler](https://en.wikipedia.org/wiki/Adler-32). Adler is one of the fastest checksum I've tested. 
+So, I wrote yabitrot, which takes hardlinks into account. After evaluation, I decided to use [adler](https://en.wikipedia.org/wiki/Adler-32) as the checksum. Adler is one of the fastest checksums I've tested. 
 
-So, yabitrot is based on the Inode of each files. For each Inode yabitrot compute the checksum and store the timestamps at which this calculation has been made.
+In short, yabitrot is based on the Inode of each file. For each Inode, yabitrot computes the checksum and stores the timestamp at which this calculation was made.
 
   
 
@@ -46,7 +46,7 @@ During the next runs, yabitrot will compare the checks
 
 On OpenBSD, I suggest you trigger yabitrot via the weekly.local or via the monthly.local file. You will thus receive an email listing the files having bitrot issues. 
 
-You must not be root to run it, but you have to run it with the user having enough permissions to read all targeted files and to store the DB on the root of your targeted folder. 
+You do not need to be root to run it, but you have to run it as a user with enough permissions to read all targeted files and to store the DB at the root of your targeted folder. 
 
 
 
@@ -61,7 +61,7 @@ I'm running this script since several weeks now withou
 
 I can scan my NAS disk of 720GB in 2h40 on my small board with 2 CPUs at 3.3GHz and 4GB of RAM. This disk has 6.8 million files, but only 760,000 Inodes (so each file has about 9 different names). 
 
-Here after the details after the 1st scan. 
+Here are the log details after the 1st scan. 
 
     No cleanup required
     760238 files added
@@ -79,6 +79,15 @@ We see that such DB takes 29MB !!!
     drwxr-xr-x   4 root  wheel   512B Mar 24  2018 ..
     -rw-------   1 root  wheel  29.1M Oct 13 20:37 .cksum.db
 
+Here are the log details after the next scan:
 
+    Wed Oct 17 07:54:04 2018: 1025 files removed from DB
+    Wed Oct 17 07:54:04 2018: 1302 files added
+    Wed Oct 17 07:54:04 2018: 205 files updates
+    Wed Oct 17 07:54:04 2018: 0 files error
+    Wed Oct 17 07:54:04 2018: 6094853 files analysed in 8643.72 sec, 706.798 GB
+    Wed Oct 17 07:54:04 2018: 760515 entries in the DB
 
 
+
+
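The README text in this diff describes checksumming each Inode once, rather than each file name, using Adler-32. As a rough illustration of that idea only, here is a minimal Python sketch. It is not yabitrot's actual code: the function names `adler32_of` and `scan_by_inode` are hypothetical, the result is kept in a plain dictionary, and the storage in `.cksum.db` as well as the comparison with previous runs are left out.

```python
import os
import stat
import time
import zlib


def adler32_of(path, chunk_size=1 << 20):
    """Return the Adler-32 checksum of a file, reading it in chunks."""
    checksum = 1  # Adler-32 starting value
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            checksum = zlib.adler32(chunk, checksum)
    return checksum & 0xFFFFFFFF


def scan_by_inode(root):
    """Checksum every regular file under root once per Inode.

    Returns {(device, inode): (checksum, timestamp)}.
    Hypothetical helper, not part of yabitrot.
    """
    seen = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.lstat(path)
            if not stat.S_ISREG(info.st_mode):
                continue  # skip symlinks, sockets, ...
            key = (info.st_dev, info.st_ino)
            if key in seen:
                continue  # another name (hardlink) for an Inode already read
            seen[key] = (adler32_of(path), time.time())
    return seen
```

With this approach, a file that has 9 hardlinked names is read and checksummed only once, which is the saving the README argues for.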