External HDD preparation best practices

19 replies [Last post]
hack and hack
Offline
Joined: 04/02/2015

I just bought a 4 TB disk, and I don't want to have to start over with that much space, so I'm trying to get it right from the start for once.
I'm thinking of:
- formatting the whole thing (which format? ext4 is GNU/Linux only I think, btrfs I don't know if it's worth it, ntfs maybe to connect to other systems than GNU/Linux, which I don't need to... ).
- partitioning or not? I read it would be a good idea to prevent data loss, but that's very vague. The initial idea is to use it as a mass storage disk, and use my other smaller disks as back-up disks and travel disk (if needed, most likely not). So instead of partitioning for data storage, I could maybe partition one part for backup, and the rest for storage.
- passphrase protection: I'm sure it's manageable from GParted. But that's yet another passphrase to remember. Written data is more worth protecting than non-personal pictures, videos and music. I guess I'll have to make up my mind by myself on this one.

So, what would you do, and why?

Added info:
Looks like ext4 is the fastest overall, followed by btrfs, with ntfs way behind (and ntfs isn't useful anyway, since I don't plan to share this drive with a proprietary OS). But is it a problem if my OS is on btrfs and my drive is ext4?
https://www.phoronix.com/scan.php?page=article&item=linux-40-hdd&num=2

For partitioning, conflicting advice:
"The other benefit is it will speed up searches for songs or pictures if you only need to look in 250 GB vs 1000GB for the file."
-
"Partitioning is recommended only for HD's which is used to store data AND programs."
-
"it's debatable - I'd still go for a couple of partitions to speed up search"
-
"You also need to think about backup - this stuff is precious! Backing a terabyte up down a USB channel isn't exactly quick! So you might want to have two partitions, one for finished videos and the other a work volume" (whaat, backing up that data too??)
-
"Fragmentation may be an issue, I agree. However, if I have to choose between fragmentation issue -vs- free space management issues, I'd choose fragmentation issue anytime.
It's a lot easier to defragment a fragmented drive than to shuffle files between partition, or -- God forbid -- repartitioning the drive." (well, LVM should be ok to resize the partitions if ever needed, but what about fragmentation? Should I worry about it?)
-
"you can independently do one drive without disturbing the other, u can put multiple boot disks and different OS, defrag seperately, more organized,...etc."
-
https://www.cnet.com/forums/discussions/partition-or-not-partition-331189/

It looks like partitioning is worth it. Any thoughts?

hack and hack
Offline
Joined: 04/02/2015

LUKS/LVM (as well as setting up passphrase protection) seem to be available from GParted.
http://thesimplecomputer.info/full-disk-encryption-with-ubuntu

tl;dr:
I'll probably partition the drive, but strictly for data organisation (video, documentation, pictures, audio).
I'll decide the size of each partition in relation to my existing data.
For backing up the computer's data, though, I'd rather use a smaller drive (more writes, and it feels safer on a smaller drive since less data is involved). And I'll definitely use a passphrase. Last but not least, I'll have to set up the partitions to be resizable.

Questions left:
I wonder if I should keep, I don't know, a third of the drive for backing it up on another partition on this same drive.

Also, ext4? btrfs?

And defragmentation: should I worry about it?

Also, do you know any cataloguing software? Should I use one in that case?
It looks like LibreOffice Base is an option, but I don't understand how storing data on a drive (which will be unplugged most of the time), plus a database and a way to retrieve that data (yes, I don't know exactly how this works), is better (faster, cleaner, safer) than looking for it with a standard file manager.
Also, does that mean that I'm wasting time organizing my data into folders on my PC?

Magic Banana

I am a member!

I am a translator!

Online
Joined: 07/24/2010

We are not talking about a disk that will contain an operating system, right?

To organize the files there is something called directories (aka folders), which do not bring any partitioning issue. File searches can be restricted to a directory.
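To illustrate the point about restricting a search to a directory, here is a minimal Python sketch. The folder names (`music`, `pictures`) and the helper `find_files` are made up for the example; the temporary directory merely stands in for the external drive.

```python
import tempfile
from pathlib import Path


def find_files(root, pattern):
    """Return files under `root` (recursively) whose name matches `pattern`."""
    return sorted(p for p in Path(root).rglob(pattern) if p.is_file())


if __name__ == "__main__":
    # Throwaway tree standing in for the external drive (hypothetical layout).
    with tempfile.TemporaryDirectory() as drive:
        for rel in ("music/a.ogg", "music/live/b.ogg", "pictures/c.png"):
            path = Path(drive, rel)
            path.parent.mkdir(parents=True, exist_ok=True)
            path.touch()
        # Only the "music" directory is walked; "pictures" is never visited.
        for hit in find_files(Path(drive, "music"), "*.ogg"):
            print(hit.relative_to(drive))
```

The search cost depends on how many entries sit under the chosen directory, not on the size of the partition that holds it.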

One directory can be for the backups, e.g., with Trisquel's "Backups" tool in the "System Settings". However, the backups may end up occupying the whole disk. "Backups" does that (and automatically removes the older versions of the files when space runs short). That said, since it backs up incrementally, it may take quite some time to fill up the whole disk. Well, it depends on how much data you back up and how often those files are modified...

Anyway, since backups may end up filling the whole partition (it is "normal"), it makes sense to have a separate (and large enough) partition for files you want to carry, besides the partition (the rest of the disk) for the backups. And I am talking of backing up files from another disk (and the external disk can itself be backed up on the other disk: "cross backups"). Indeed, although you can back up on the same disk, it is not particularly safe: the disk (the hardware) can fail, suffer an accident (it falls), be stolen, etc. In such eventualities, the data on all partitions are lost at once.

Modern filesystems do not suffer from fragmentation problems until they are almost full. You can choose any filesystem, and it will be readable from any operating system whose kernel supports that filesystem type. XFS is often said to deal more efficiently with large files (a few megabytes is already considered large). However, XFS cannot be shrunk. That would be a problem if you choose to have several partitions, oversize a partition with an XFS filesystem and then need more space for another partition. Btrfs is efficient but it still is rather new: the probability of losing data, although very small, is higher than with a more mature filesystem type such as ext4. That said, it is because people do not adopt btrfs that it keeps suffering from this novelty issue (it actually is not that new anymore).

So, if I were you, I think I would go with two partitions: one for files to carry (I do not know how much you want to carry but you had better over-dimension than under-dimension that partition), organized in folders, and the other partition (the rest of the disk) dedicated to the backups. Both can be formatted in ext4 but you may prefer NTFS for the first partition (so that you can share files with Windows users).

hack and hack
Offline
Joined: 04/02/2015

Yes, it's strictly a data external drive, no OS.

Basically, I want mass storage (well organized) and backup (like Déjà Dup or Back In Time, since Clonezilla seems to be for mass deployment, or at least for a simpler system reinstall, not so much for data).

I have one huge 4 TB external drive, and a few more smaller external drives.
I want the 4 TB one to be the warehouse, not used often, not even plugged in at all times, but well organized in folders, as you suggested. Maybe I'll save the really important data on some DVDs, or I can try the cross backup you talk about (two disks backing up one another, right?).
But I don't get it: let's say my backup drive fails. Why would a cross backup matter since I have the original data on the computer's disk anyway? Or is it to have older versions of the same files?
I can see how "self-backup" (the data's drive backed up in the same drive) can be a bad idea.
And good to know that partitioning is now a worry of the past, thanks.

I want one of the smaller drives to be used as a backup mainly, and partly as a storage extension, like you suggested (but since it's not the main storage, I can even skip that and use this whole disk strictly for backup). If I back up my home folder (or most of it) and a few system files, that should leave some wasted space (which I can set aside as a partition and use as extra storage, maybe).
But how much space to leave for the backup partition? Should I double or triple the amount of disk space I expect to backup?

Yes, I thought folders would be enough, but reading about collections managers (basically database management with indexing if I get it right), I thought this might be faster/more convenient. Even on a super large drive, partitioning wouldn't make a search faster if the data is organized in folders.
That reassures me that simply having a well-organized drive is enough.

Also, taking GCstar as an example, I don't like the automatic online data retrieval regarding privacy.
And I doubt I'll need to make queries by criteria.
https://en.wikipedia.org/wiki/GCstar

From what I understand, I should format my 4 TB drive as ext4, encrypt it, and not bother with partitioning.
Then, I take a drive strictly for backup, encrypt it as well I suppose (I think the same passphrase for the two drives should be fine, my memory has limits), and make sure it can hold twice or thrice the disk space I want to back up (wild guess).
But if that "smaller" disk is still too big, and I don't need to have 50 versions of the same file, I might just partition it to have extended disk space available.

Basically, I need to learn the encryption procedure in both cases. For plain storage, the info is somewhere in the link I posted about GParted. For backup, I need to find out inside the software I will choose. Or maybe just encrypt the drive with GParted, then install the software (the latter makes more sense). I also need to figure out how much space is enough for backup.

Magic Banana

I am a member!

I am a translator!

Online
Joined: 07/24/2010

I'll save the really important data on some DVDs

Notice that DVDs are not eternal either.

Why would a cross backup matter since I have the original data on the computer's disk anyway?

If all files on the disk are stored on another disk as well, it does not matter. I thought that you wanted the disk to have the only copies of some large files and a backup of these same files on the same disk.

I want one of the smaller drives to be used as a backup mainly, and partly as a storage extension

If it is smaller, you will probably not have enough space to back up the larger disks. Backups can be compressed ("Backups" compresses by default), but files on a personal computer (movies, pictures, music, LibreOffice documents, etc.) usually are in compressed formats already. As a consequence, basically no space is gained by compressing what already is compressed.
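A quick way to see this effect is to compress two buffers of the same size and compare the results. The sketch below is only an illustration: random bytes stand in for already-compressed media (an assumption, since well-compressed data looks statistically close to random), and `compression_ratio` is a helper made up for the example.

```python
import os
import zlib


def compression_ratio(data):
    """Compressed size divided by original size (lower means more space saved)."""
    return len(zlib.compress(data, 9)) / len(data)


if __name__ == "__main__":
    text = b"backup of a plain text file\n" * 4000  # highly redundant content
    noise = os.urandom(len(text))  # stand-in for already-compressed media
    print(f"plain text:        {compression_ratio(text):.3f}")
    print(f"compressed-like:   {compression_ratio(noise):.3f}")
```

The redundant text shrinks to a small fraction of its size, while the random buffer stays at roughly its original size (or slightly above, because of the container overhead).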

But how much space to leave for the backup partition? Should I double or triple the amount of disk space I expect to backup?

That looks reasonable. With an incremental backup ("Backups" does that by default too), you will be able to go quite far back in the past (recover files removed quite some time ago, or old versions of files that were modified).
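The "double or triple" rule of thumb is simple arithmetic; a small sketch, assuming sizes in GB. The function names and the example figures in the usage note (300 GB of data, a 1000 GB drive) are hypothetical.

```python
def backup_partition_size_gb(data_gb, factor=2.0):
    """Space to reserve for incremental backups: `factor` times the data backed up."""
    if data_gb < 0 or factor < 1:
        raise ValueError("data_gb must be >= 0 and factor must be >= 1")
    return data_gb * factor


def leftover_gb(disk_gb, data_gb, factor=2.0):
    """Space left for plain storage once the backup partition is reserved."""
    return disk_gb - backup_partition_size_gb(data_gb, factor)


if __name__ == "__main__":
    # Hypothetical figures: 300 GB to back up, 1000 GB backup drive.
    for factor in (2, 3):
        reserved = backup_partition_size_gb(300, factor)
        print(f"x{factor}: reserve {reserved} GB, {leftover_gb(1000, 300, factor)} GB left")
```

A negative leftover means the drive is too small for that factor, so either the factor or the set of files to back up has to shrink.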

Yes, I thought folders would be enough, but reading about collections managers (basically database management with indexing if I get it right), I thought this might be faster/more convenient.

Well, that may be my ignorance speaking but I do not believe that searching in a folder would be faster if that folder is alone in a partition. I am talking about a search restricted to the folder in question of course.

For backup, I need to find out inside the software I will choose.

With "Backups", it is a box to check. But if you encrypt the partition, it looks useless to also encrypt the backup.

hack and hack
Offline
Joined: 04/02/2015

Ah sorry, I didn't explain properly:

A is my PC HDD, with the OS within, and some data of course.
B (as in "Backup") is the external backup drive, either the same size as A or a bit larger. It's smaller than C though.
C is the huge HDD used for storage, unplugged from the PC most of the time.

-

So there are two separate ideas, because B should have the same files as A since it backs it up (so it's ok if either one crashes), but for C, I was thinking of the DVDs for backing up a few specific files. Though these files will probably be on A and B too, so I guess more is really overkill.

I was indeed considering (at first) backing up all the files of C inside itself, and you recommended against it (plus it would halve the space for data). But it's true that DVDs don't last for too long.

-

Well, that may be my ignorance speaking but I do not believe that searching in a folder would be faster if that folder is alone in a partition. I am talking about a search restricted to the folder in question of course.
Yes, I too doubt partitioning would speed up anything.

-

With "Backups", it is a box to check. But if you encrypt the partition, it looks useless to also encrypt the backup.
Ah, that simplifies things, thanks for the info.
DejaDup has some bad press though (maybe because of the user, but still): https://www.reddit.com/r/Ubuntu/comments/4blx2o/should_we_replace_deja_dup_as_the_default_backup/

The only bad thing I found about Back In Time for now is this:
There's a "gotcha" with backintime - "dot" files are excluded by default. If you want your home directory's dot files, use backintime's Settings->Exclude and remove .*
https://askubuntu.com/questions/2596/comparison-of-backup-tools

EDIT: It seems it's been fixed (https://github.com/bit-team/backintime/issues/315).

-

So I suppose I should format C as ext4 with encryption.
I should format B as ext4 (no encryption) and set the software to backup on B, with encryption from the software.
And maybe get a few files on DVD just in case the 3 HDD crash all at once (obviously highly unlikely).

That means a whole bunch of passphrases. I probably need to learn how to use gnome-keyring and Seahorse so I don't have to memorize them all.

Did I miss something? Is there a better idea? Is there a way to backup C as a whole, like using yet another huge drive and clone it there periodically (that seems quite complicated)?

Magic Banana

I am a member!

I am a translator!

Online
Joined: 07/24/2010

I believe I would simply use A and B to store different files (on B, those you want to share) and back up both on C. Two partitions on A (/ and /home), one on B, one on C.

And, yes, DéjàDup has (minor) problems: whenever you want to back up (notice that you can define a fixed period, and DéjàDup warns you that the backup is postponed if the disk is not plugged in):

  1. plug the disk;
  2. close the error message;
  3. open "Backups" (in the "System Settings") and choose "Back Up Now...".

Once the backup is over, ignore the error message that tells you that ".dconf" could not be backed-up. To ease the process, you can create a launcher with 'deja-dup --backup' as a command. Clicking that launcher substitutes the third step above.
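Such a launcher is just a small text file in the freedesktop.org desktop entry format. A minimal sketch that generates one follows; the entry name "Back Up Now" and the file name `backup-now.desktop` are arbitrary choices made for the example, and only the `deja-dup --backup` command comes from the post above.

```python
from pathlib import Path


def launcher_text(name="Back Up Now", command="deja-dup --backup"):
    """Return the contents of a desktop entry that runs `command`."""
    return (
        "[Desktop Entry]\n"
        "Type=Application\n"
        f"Name={name}\n"
        f"Exec={command}\n"
        "Terminal=false\n"
    )


def write_launcher(directory, filename="backup-now.desktop"):
    """Write the launcher into `directory` and mark it executable."""
    path = Path(directory) / filename
    path.write_text(launcher_text(), encoding="utf-8")
    path.chmod(0o755)
    return path


if __name__ == "__main__":
    print(launcher_text(), end="")
```

Calling `write_launcher(Path.home() / ".local/share/applications")` should make the entry show up in the applications menu on most desktops, assuming that directory already exists.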

hack and hack
Offline
Joined: 04/02/2015

So you would extend A with B, and use C as the backup area for both.

It can work of course, but in my situation, you must imagine C already holding nearly 3 TB of data. That still should leave enough for backup, but I'd rather keep C unplugged as often as possible, so that on A, there's only the data I currently need, and B is the backup of A.

The only missing piece would be to backup C in some way, but it's probably overkill (plus the REALLY important data will already be both on A and B).

Now there's the thankless job of organizing all that data...
But I'll sure enjoy it after it's done :)

Sasaki
Offline
Joined: 08/11/2014

Just to say there is a precaution to take when formatting SSD drives, if you plan to buy one. Mine has been broken since I formatted it in FAT32. Now it takes five hours to copy 2 GB of data from the disk to the computer.
I've read somewhere that you must format that type of disk with care, leaving some free space in the partition, or update the firmware or something like that; I can't find the page anymore.

hack and hack
Offline
Joined: 04/02/2015

Thanks Sasaki, definitely worth knowing.

About encrypting an external drive, it seems gnome-disks is a tool to use, not GParted (that previous link I posted was about an OS install): https://help.ubuntu.com/community/EncryptedFilesystemsOnRemovableStorage

hack and hack
Offline
Joined: 04/02/2015

I have a hard time finding tutorials about using gnome-disk-utility (aka gnome-disks, also disks I think).
There's one small thing that worries me a little: at first I had a single partition, and now I have two.
It might be normal, since here (https://askubuntu.com/questions/805485/how-does-luks-work) they talk about LUKS adding a layer,
and in the gnome-disks window, the top layer does indeed have a lock symbol.

Here's what I did:
1. I erased existing partitions, deleting the manuals that were in there (all pretty much useless plus I backed it all up just in case).

2. Having one partition left, I tried the menu "wheel" icon at the top. I assume it's for the whole disk, since there's the same wheel when selecting a partition, but that one is inside the window, not at the top.
Formatting showed a progress bar, but offered no LUKS option, so I canceled it.

3. Then I tried the same format action but on the unique partition, from the menu inside the window. There I could select the LUKS+ext4.

-

The only indicator that something is going on is a small waiting wheel (like the mouse indicator) next to the drive name in the left column.
But I know it will take about 4 days, because there was a progress bar when I tried at first to format only.

So I'm hoping this is going well.
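As a rough sanity check on the 4-day figure, the average write rate it implies can be worked out directly. This is only a sketch: it assumes a decimal 4 TB (4 × 10^12 bytes) and a perfectly steady rate, and the 100 MB/s comparison figure is a made-up reference point, not a measurement of this drive.

```python
def implied_throughput_mb_s(capacity_bytes, days):
    """Average write rate (MB/s) needed to cover `capacity_bytes` in `days`."""
    return capacity_bytes / (days * 86400) / 1e6


def wipe_duration_hours(capacity_bytes, throughput_bytes_per_s):
    """Hours needed to write `capacity_bytes` at a steady throughput."""
    return capacity_bytes / throughput_bytes_per_s / 3600


if __name__ == "__main__":
    four_tb = 4e12  # decimal terabytes, as drive vendors count them
    print(f"4 days implies about {implied_throughput_mb_s(four_tb, 4):.2f} MB/s")
    # Hypothetical comparison rate of 100 MB/s:
    print(f"at 100 MB/s: about {wipe_duration_hours(four_tb, 100e6):.2f} hours")
```

So a 4-day estimate corresponds to an average of roughly 11.6 MB/s over the whole disk.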

SuperTramp83

I am a translator!

Offline
Joined: 10/31/2014

hi hack mate!

To encrypt a stick I used this guide, which is very clear.

http://www.howtogeek.com/115955/how-to-quickly-encrypt-removable-storage-devices-with-ubuntu/

hack and hack
Offline
Joined: 04/02/2015

Thanks SuperTramp, it's pretty much what I did (luckily!).

It's just that, in that dark grey window showing the partition, I used to have one, but starting the wipeout+encryption, I have now two.

Though, which is reassuring, both have the same size, so that still makes it one partition.
It still feels weird. I wonder if zero-wiping a single partition from the inner (partition) menu is the same as doing it from the outer (whole disk) menu.

Maybe I should stop the process, and instead of doing both at once:
1. wipeout the whole disk with zeros first
2. format and encrypt.

Or maybe it's like that because LUKS necessarily would imply LVM, I don't know.
It looks like I can change partitions easily, so that should be LVM. Then LUKS is implemented over it.
That's the most sound explanation I can find.

In other words, wiping out one (unique) partition or the whole disk is exactly the same.
Everything in that unique partition will be encrypted anyway.

So if I'm not mistaken, I shouldn't worry.

And that also would mean that a vertical split shows a partition separation while a horizontal split would show LUKS over the encrypted partition (the LVM part).

SuperTramp83

I am a translator!

Offline
Joined: 10/31/2014

Truth is I don't know, hack. My knowledge of disk encryption is quite limited. Hopefully someone more experienced will be able to reassure you.

hack and hack
Offline
Joined: 04/02/2015

Well, I'll tell you how it went in 3 days then :)

hack and hack
Offline
Joined: 04/02/2015

It's been 5 days now, still with no indication of how much longer the formatting/encrypting will take.

It's basically this issue: https://askubuntu.com/questions/19049/disk-utility-running-for-a-long-time

Any idea?
For now I'll leave it running.

For now, it is unlocked, and when I try to lock it from disk-utility, I get:

Error locking /dev/sdX: Command-line 'cryptsetup luksClose "luks-bunchofalphanumericalcaracters-andsomemore" exited with non-zero exit
status 5:device mapper:remove ioctl on luks-samealphanumericalnumbers failed: Peripheral or resource busy
-
and then from "device-mapper" it loops.

EDIT:
It might be linked to this https://bugs.launchpad.net/ubuntu/+source/cryptsetup/+bug/484429

I know why I can't unplug it or lock it: the formatting is probably still going on. The problem is that I don't know for sure that it's still going on, or for how long.

htop shows 4 gnome-disks processes: 2 indicate 0 in the TIME+ column, one seems always stopped and shows 12 hours, while the remaining active process shows 65 hours and starts and stops every few seconds, irregularly.

I suppose it's still working, but it's only a guess.

SMART data shows zero issues on the drive btw.

I can't seem to find any log about gnome-disks, that would hopefully say if everything is going fine or not.
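In the absence of a log, one possible check (a sketch, not something tried in this thread) is to sample the kernel's per-device counters in `/sys/block/<device>/stat` twice and see whether the "sectors written" field keeps growing. The two sample lines in the example are made up, and the helper names are hypothetical; on a real system the lines would come from reading the stat file of the drive in question a few seconds apart.

```python
SECTOR_BYTES = 512  # /sys/block/*/stat counts sectors in 512-byte units


def sectors_written(stat_line):
    """Extract the 'sectors written' counter (7th field) from a stat line."""
    return int(stat_line.split()[6])


def write_rate_mb_s(before, after, seconds):
    """Average write rate between two stat samples taken `seconds` apart."""
    delta = sectors_written(after) - sectors_written(before)
    return delta * SECTOR_BYTES / seconds / 1e6


if __name__ == "__main__":
    # Made-up samples, as if read 10 seconds apart from /sys/block/<device>/stat.
    first = "1200 30 96000 800 5000 100 2048000 9000 1 7000 9800"
    second = "1200 30 96000 800 5400 100 2252800 9900 1 7600 10700"
    print(f"{write_rate_mb_s(first, second, 10):.2f} MB/s written")
```

A rate that stays at zero over several samples would suggest the format has stalled, while a steady non-zero rate would suggest it is still writing.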

hack and hack
Offline
Joined: 04/02/2015

Ok, so I ended up stopping the formatting, and did a new one (but fast this time) on the whole drive, and did it again with LUKS+ext4. Plus it's a brand new drive, so it should be more than fine.

It's troublesome that it keeps going forever though, and also that there's no progress indicator/bar/%.

Anyway, this problem aside, I tried to use the CLI to first unlock it, then mount it, then unmount it, then lock it back.
A fine example of wasting time on the CLI (plus I couldn't make it work, with more than half an hour of research).
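For reference, the unlock, mount, unmount, lock cycle is usually done with `cryptsetup` and `mount`. The sketch below only builds and prints the commands rather than running them; the partition `/dev/sdX1`, the mapper name `bigdisk` and the mount point `/mnt/bigdisk` are placeholders, not values from this thread.

```python
import shlex


def luks_session_commands(partition, mapper_name, mount_point):
    """Commands (to run as root) for one unlock/mount then unmount/lock cycle."""
    mapped = f"/dev/mapper/{mapper_name}"
    return [
        ["cryptsetup", "open", partition, mapper_name],  # prompts for the passphrase
        ["mount", mapped, mount_point],  # the mount point must already exist
        ["umount", mount_point],
        ["cryptsetup", "close", mapper_name],  # only works once nothing uses the mapping
    ]


if __name__ == "__main__":
    # Placeholder names: adjust the partition, mapper name and mount point.
    for cmd in luks_session_commands("/dev/sdX1", "bigdisk", "/mnt/bigdisk"):
        print("sudo " + shlex.join(cmd))
```

The order matters: closing the mapping while the filesystem is still mounted or otherwise in use tends to fail with a "busy" error, similar to the one quoted earlier in the thread.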

So I tried on Nautilus. Prompted for my pass phrase, worked.
End of the story.

Just make sure to click on "immediately forget the passphrase", because if you don't, you'll be able to unplug the drive and plug it back in without the PC asking for the passphrase (at least until you restart the PC).

So in a nutshell, I had troubles with slow formatting.
Maybe I should have done it on the whole disk first instead of from the partition.

Also, on my Trisquel netinstall, I don't have the authorization to open the external HDD.
I don't have any other software blocking this.
I'll try again after restarting.

EDIT:
For those interested, this worked (the first line, about unlocking. I could then mount with a file manager).

Also fdisk says my drive has a wrong partition table, even if it appears on the file manager when plugged.
I think my Trisquel install is a mess...

hack and hack
Offline
Joined: 04/02/2015

Just so you know, I noticed that after the drive has been plugged in for a while, the option for safely removing the drive disappears :(

Gotta go through gnome-disks to stop the disks from spinning for a while after unmounting. I suppose this should be a bug report to whoever keeps taking care of this software.
Should this be a distro-based bug report, or should I find out who deals with Nautilus?

Magic Banana

I am a member!

I am a translator!

Online
Joined: 07/24/2010

That would be for Nautilus' developers... but they will probably ask you to use the latest version of Nautilus to confirm that the bug still exists.

hack and hack
Offline
Joined: 04/02/2015

Got it, thanks.