SAS HDD Drive

tsdnz
Member
Posts: 333
Joined: Sun Jun 16, 2013 4:09 am

SAS HDD Drive

Post by tsdnz »

Hi, I am about to write my SAS HardDisk driver.
Has anyone had the SAS not register on the PCI list?
It registers the RAID controller, but no mass storage devices.

Any ideas?
Geri
Member
Posts: 442
Joined: Sun Jul 14, 2013 6:01 pm

Re: SAS HDD Drive

Post by Geri »

You are pretty much dealing with something too special and new-age here; I had not even heard of it and had to google it.

I think it should work just like regular SATA; the rest of the problems are probably due to buggy firmware in your motherboard. Logically this seems to be just a SATA device with different wiring. Try enabling ATA emulation mode in the BIOS (UEFI) setup, and if it works, just tell your users to run it in legacy emulation mode. You don't really need to support every braindead thing that knocks on your door.

If NONE of the above helps, I suggest using BIOS interrupt 13h to access the drive if you booted from it, or if it's the secondary drive (CHS first, then LBA for anything above the 8 GB mark). If it's handled by a separately initialized chip, you have no other option anyway, because in that case you would need to write a driver for that chip, and I really, really would not do that: a year from now they will no longer manufacture that particular part.
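For context on "CHS first, then LBA": the classic INT 13h calls address sectors by cylinder/head/sector, and with the usual BIOS limits (1024 cylinders, 255 heads, 63 sectors per track) that tops out at roughly 8.4 GB; past that point the INT 13h extensions take an LBA instead. A small Python model of the arithmetic follows. The function names are made up for illustration, and the 255/63 geometry is an assumption; a real loader would ask the BIOS for the drive's geometry rather than hard-code it.

```python
SECTOR_SIZE = 512
MAX_CYLINDERS = 1024        # 10-bit cylinder field in classic INT 13h
HEADS = 255                 # assumed translated geometry
SECTORS_PER_TRACK = 63      # 6-bit sector field, numbered from 1


def chs_to_lba(c, h, s, heads=HEADS, spt=SECTORS_PER_TRACK):
    """Convert a cylinder/head/sector triple to a linear block address."""
    if not (0 <= h < heads and 1 <= s <= spt and c >= 0):
        raise ValueError("CHS value out of range for this geometry")
    return (c * heads + h) * spt + (s - 1)


def lba_to_chs(lba, heads=HEADS, spt=SECTORS_PER_TRACK):
    """Inverse of chs_to_lba for the same geometry."""
    c, rem = divmod(lba, heads * spt)
    h, s0 = divmod(rem, spt)
    return c, h, s0 + 1


def needs_lba_extensions(lba, heads=HEADS, spt=SECTORS_PER_TRACK):
    """True when the sector lies beyond what classic CHS calls can reach."""
    return lba >= MAX_CYLINDERS * heads * spt


# Size of the CHS-addressable region in bytes for the assumed geometry.
CHS_LIMIT_BYTES = MAX_CYLINDERS * HEADS * SECTORS_PER_TRACK * SECTOR_SIZE
```

With this geometry `CHS_LIMIT_BYTES` works out to 8,422,686,720 bytes, which is where the "8 GB" figure comes from.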

If you are going to touch the hardware with your own drivers, you should have multiple generic disk fallbacks so that you can always access the disks (even USB pen drives). A BIOS fallback is the best way to go, but it can't handle more than 2 disks.

This kind of hardware belongs in servers. Maybe you expected a more special and "professional" solution there; if so, I must disappoint you: this kind of stuff is crappier than the OEM hardware for regular users.
Last edited by Geri on Tue Aug 15, 2017 6:38 pm, edited 1 time in total.
Operating system for SUBLEQ cpu architecture:
http://users.atw.hu/gerigeri/DawnOS/download.html

Re: SAS HDD Drive

Post by tsdnz »

Thanks. I have tried the PCIe MCFG table using MMIO; still the same, the IDE drives are not listed on PCI.
I have gone right through BIOS, will do it again.
Not sure if the PCI RAID needs to be (?????? = Setup or Enabled or ????)

Tricky one, it is working on my other servers.
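
For reference, the MCFG route works like this: each MCFG entry gives a base address for a range of buses, and every function's 4 KiB configuration space sits at a fixed offset from that base. A minimal Python sketch of the address arithmetic (the function name is hypothetical, and the base address used in the usage line is made up; real values come from the ACPI MCFG entry):

```python
def ecam_address(base, start_bus, bus, device, function, offset=0):
    """Physical address of a config-space register via PCIe ECAM (MCFG).

    base      -- base address from the MCFG entry
    start_bus -- first bus number covered by that entry
    """
    if not start_bus <= bus <= 255:
        raise ValueError("bus outside the range covered by this MCFG entry")
    if not (0 <= device < 32 and 0 <= function < 8):
        raise ValueError("device must be 0..31 and function 0..7")
    if not 0 <= offset < 4096:
        raise ValueError("offset must fit in the 4 KiB config space")
    # 1 MiB per bus, 32 KiB per device, 4 KiB per function.
    return (base
            + ((bus - start_bus) << 20)
            + (device << 15)
            + (function << 12)
            + offset)
```

For example, `hex(ecam_address(0xE0000000, 0, 1, 2, 3, 0x08))` gives `0xe0113008`, the class code register of bus 1, device 2, function 3. Scanning this way enumerates PCI functions only, whichever access method is used.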

Re: SAS HDD Drive

Post by Geri »

I don't think you will be able to use it in any mode other than normal generic IDE mode. If you set it to RAID mode, even Windows needs a separate driver to access such disks (like a driver floppy during install, if you install onto it). But it will PROBABLY still work with CHS access even if it's set as RAID.

Re: SAS HDD Drive

Post by tsdnz »

Geri wrote: I don't think you will be able to use it in any mode other than normal generic IDE mode. If you set it to RAID mode, even Windows needs a separate driver to access such disks (like a driver floppy during install, if you install onto it). But it will PROBABLY still work with CHS access even if it's set as RAID.
Thanks, I have to detect or find it first. LOL
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: SAS HDD Drive

Post by Brendan »

Hi,
tsdnz wrote:Hi, I am about to write my SAS HardDisk driver.
Has anyone had the SAS not register on the PCI list?
It registers the RAID controller, but no mass storage devices.

Any ideas?
Unfortunately, for SCSI controllers (including SAS/Serial Attached SCSI) and RAID controllers (including SATA RAID controllers), there is no standard programming interface for the controller (like the AHCI standard for SATA controllers); and you need different drivers for different controllers.

To have some idea of what is going wrong; someone would need to know which specific SAS/RAID controller it is (and the datasheet/programming manual/documentation for that specific SAS/RAID controller), and they'd need to see your code to compare against the controller's documentation (to determine why your driver doesn't find any devices attached to the controller).

Note that (because there's no standard programming interface and because it's much less common than "commodity SATA" hardware); very few people have any experience with writing drivers for any SAS, SCSI or RAID controllers. For example (for me specifically), I took a brief look at a datasheet for one of Adaptec's SCSI controllers in the 1990s, forgot whatever I saw, and haven't looked at any other SCSI or RAID controller since.

Also; you can expect that any controller that supports RAID is going to be significantly harder to implement due to the possibility of RAID itself - not just figuring out how many drives are attached but extra stuff for detecting/configuring how logical devices are created from individual drives (JBOD, RAID 0, RAID 1, ....); and not just basic error handling and S.M.A.R.T but extra stuff for managing things like "RAID array reconstruction".


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.

Re: SAS HDD Drive

Post by tsdnz »

Thanks Brendan.

I have just discovered how much work it is going to be.
Just getting the right manual/docs is tricky enough.
I might get a SAS/SATA controller and install that.

Ali

Re: SAS HDD Drive

Post by tsdnz »

Can anyone recommend a SAS controller for a Dell 720 server?
I currently have a PERC H710 Mini.
Basically I want it to just register the drives on the PCI.

Thanks, Ali

Re: SAS HDD Drive

Post by Brendan »

Hi,
tsdnz wrote:Can anyone recommend a SAS controller for a Dell 720 server?
I'd expect the same kinds of problems for all SAS controllers.

Note: I was tempted to suggest an Intel controller (as Intel often do provide documentation), but then I tried to find documentation for one of their SAS/RAID controllers (an Intel RS3DC Family controller) and failed to find any. I suspect that it's one of the cases where (one day, when your OS is successful enough for companies to care) you'd negotiate with the manufacturers and they write the driver (or maybe they give you information after you've signed a non-disclosure agreement).

For now; I'd recommend forgetting about SAS. For alternatives; both AHCI/SATA and NVMe are standardised and documented.
tsdnz wrote:I currently have a PERC H710 Mini.
As far as I can tell this is created by LSI Logic/Symbios Logic (and re-badged by Dell). I couldn't find any datasheet/manual from the manufacturer, but there is an open source "meta-driver" (for a group of similar controllers including the PERC H710 Mini) for Linux that might help (e.g. the "megaraid_sas*" files in the Linux kernel source).
tsdnz wrote:Basically I want it to just register the drives on the PCI.
I don't know what this means. The drives are never on PCI (they're connected to some kind of controller). Only the controller is on PCI.
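
To make that concrete: what a PCI scan can tell you is the kind of controller, via the class code register at config offset 0x08 (base class in bits 31-24, subclass in 23-16, programming interface in 15-8). A rough Python sketch of decoding it is below. The function names are made up for illustration, and the table only lists the mass-storage subclasses relevant to this thread.

```python
MASS_STORAGE = 0x01  # PCI base class for mass storage controllers

# Subclass names for base class 0x01 (partial list).
SUBCLASS_NAMES = {
    0x00: "SCSI controller",
    0x01: "IDE controller",
    0x04: "RAID controller",
    0x06: "SATA controller",
    0x07: "SAS controller",
    0x08: "NVM controller",
}


def decode_class_register(reg):
    """Split the 32-bit value at config offset 0x08 into its four fields."""
    return ((reg >> 24) & 0xFF,   # base class
            (reg >> 16) & 0xFF,   # subclass
            (reg >> 8) & 0xFF,    # programming interface
            reg & 0xFF)           # revision ID


def describe_storage_function(reg):
    """Return (name, has_standard_interface), or None if not mass storage."""
    base, sub, prog_if, _revision = decode_class_register(reg)
    if base != MASS_STORAGE:
        return None
    name = SUBCLASS_NAMES.get(sub, "other mass storage controller")
    standard = False
    if sub == 0x06 and prog_if == 0x01:
        name += " (AHCI)"
        standard = True
    elif sub == 0x08 and prog_if == 0x02:
        name += " (NVMe)"
        standard = True
    return name, standard
```

A RAID or SAS controller comes back with `has_standard_interface` false, which is the whole problem: finding the attached drives means talking to that controller through its own vendor-specific interface.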


Cheers,

Brendan

Re: SAS HDD Drive

Post by tsdnz »

Thank you Brendan, a very wise response.

My SATA code was very easy.
I assumed SAS was going to be the same, I was wrong.
I will get some SATA drives for my servers and test using the existing controllers (They accept SAS and SATA)
Hopefully this works, otherwise I will have to buy different SATA controllers, not much of an issue really.

Again, thanks for your comments, they were very helpful!!

Ali
LtG
Member
Posts: 384
Joined: Thu Aug 13, 2015 4:57 pm

Re: SAS HDD Drive

Post by LtG »

Pretty much what Brendan said, but is there a reason you want/need SAS?

If it's a curiosity then I guess you're mostly on your own and looking for either data sheets or Linux drivers.. If it's because you'd like to use it, then it might be better to just forget it and use something else (SATA/PATA).

I've worked with SAS drives (for work), but never done any driver dev, so I can't help with programming them either.. Never tried to look for data sheets so I've no idea how hard they are to come by and how easy it would be to implement the drivers. Personally wouldn't even bother =)

Re: SAS HDD Drive

Post by tsdnz »

Thanks LtG, I have SAS drives in a few test servers: 4 TB each, and plenty of them. I just thought I would write a driver for SAS as SATA was easy, well, easy in VirtualBox.
But when I tried to find the drives on the servers, just to get started with PCI vendor / device IDs, I could not find them.
This is when I started looking deeper, only to find the hole is very deep. LOL

Ali

Re: SAS HDD Drive

Post by Brendan »

Hi,
tsdnz wrote:My SATA code was very easy.
This comment worries me a little. It's easy to create bad SATA code that "works" (in that it can read/write some sectors); but good SATA code is a significant challenge.

By good SATA code, I mean something that:
  • supports full auto-detection/auto-configuration (including MSI) for both controllers and attached SATA devices
  • handles drive hot-plug and hot-unplug properly (including "unexpected hot-unplug under load")
  • supports more than just hard disk (e.g. CD/DVD read, write, eject)
  • supports the more advanced features (NCQ, secure erase, TRIM, etc)
  • handles SMART and all possible kinds of errors and drive failures (including communicating with some kind of hardware monitoring tool for administrators, including "early warning that drive will fail soon")
  • supports power management (including throttling in response to high temperature and reducing power consumption when idle)
  • supports IO priorities (so less important transfers don't ruin the performance of more important transfers)
  • cooperates with OS's caches, including having a robust "write ordering with sync" model for synchronisation
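
As one illustration of the "IO priorities" item, here is a toy Python model of a request queue where more important transfers are dispatched first and requests of equal priority stay in submission order. The class and field names are invented for this sketch, and it ignores starvation of low-priority requests, which a real scheduler would have to handle.

```python
import heapq
import itertools


class RequestQueue:
    """Toy I/O queue: lower priority number means more important."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker keeps FIFO order

    def submit(self, priority, lba, count, write=False):
        """Queue a transfer of `count` sectors starting at `lba`."""
        entry = (priority, next(self._seq), lba, count, write)
        heapq.heappush(self._heap, entry)

    def next_request(self):
        """Pop the most important pending request, or None if idle."""
        if not self._heap:
            return None
        priority, _seq, lba, count, write = heapq.heappop(self._heap)
        return {"priority": priority, "lba": lba,
                "count": count, "write": write}
```

Submitting a priority 2 read, then a priority 0 read, then another priority 2 read dispatches the priority 0 request first and the other two in the order they arrived.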

Cheers,

Brendan
mallard
Member
Posts: 280
Joined: Tue May 13, 2014 3:02 am
Location: Private, UK

Re: SAS HDD Drive

Post by mallard »

Brendan wrote:unexpected hot-unplug under load
The only "proper" way to handle that is to alert the user that their data is most likely lost and the filesystem corrupted. The "dirty" bit on most reasonable filesystems provides an indication that recovery is necessary. If it occurs on your "system drive" (the device that contains vital OS files or swap space) you have no choice but to "panic" (or maybe terminate every process that has data swapped-out to the now-unavailable device, assuming that's nothing critical). You might be able to get away with a "please re-connect that device immediately" type message in some very limited cases.

Note that there's no way to tell the difference between "user unplugged the device at a bad time" (any user-unpluggable device should have very conservative caching policies and as resilient a filesystem as possible to reduce the impact of this action) and "hardware failure" (e.g. the power supply to an external device frying itself, the user's cat biting through a cable, the device just deciding that it doesn't want to work anymore, etc.).
Brendan wrote:TRIM
Except on very early model SSDs, TRIM varies from useless to dangerous. Competent SSDs' internal garbage-collection routines are far more capable and TRIM often has a severe and unpredictable performance penalty (especially on devices that only support the non-queued version). In the best devices it's simply a no-op and in the worst it's buggy and corrupts your filesystem. Unless you've got the resources to test every SSD (family) on the market to work out the small minority of devices where it's correctly implemented and actually improves performance, it's best avoided.

If you also have an encrypted filesystem (something any serious OS should support in 2017) it's even worse, because even in the unlikely case that it's implemented correctly by the hardware and necessary for that device, TRIM reveals metadata about which blocks are/are not used by your filesystem over an insecure channel. This metadata is then stored unencrypted in the "private space" on the SSD, where it's easily available to any sufficiently determined attacker.

TRIM is best considered a mistake in the ATA command set and ignored.
Brendan wrote:SMART
As pointed out in the linked Wikipedia article, it's impossible to predict whether SMART is going to be available and work correctly in any given configuration. SMART data may be "dropped" by a RAID controller, USB interface, etc. Cheaper devices (particularly SSDs) do not support it. Even where it is available, the attribute numbers are non-standard and vendor-specific, so all you can do is guess that there's a problem based on the "common" list of attributes and display a warning (you have no way of knowing whether these "common" attributes mean what you think they do). Handling "all possible kinds of errors and drive failures" is completely, 100%, impossible.

SMART is one of those things that's a great idea in theory, but is only "half a standard" and barely works most of the time. Even if you do implement it, all you need in a low-level hardware driver is functionality to indicate support and to read attributes by number. Anything else should be in a higher layer of the storage subsystem or userspace utility somewhere.
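
To show how little "read attributes by number" needs to be, here is a rough Python sketch that parses the 512-byte block returned by SMART READ DATA. It assumes the de facto layout that common tools rely on (30 entries of 12 bytes starting at byte 2: ID, 16-bit flags, current value, worst value, 6 raw bytes, 1 reserved byte); that layout is vendor convention rather than something the ATA standard guarantees, and the function names are made up.

```python
import struct

ENTRY_SIZE = 12     # assumed size of one attribute entry
ENTRY_COUNT = 30    # assumed number of entries in the table
TABLE_OFFSET = 2    # entries start after a 2-byte revision field


def parse_smart_attributes(sector):
    """Return {attribute_id: fields} from a 512-byte SMART data block."""
    if len(sector) != 512:
        raise ValueError("SMART data block must be exactly 512 bytes")
    attrs = {}
    for i in range(ENTRY_COUNT):
        start = TABLE_OFFSET + i * ENTRY_SIZE
        attr_id, flags, current, worst = struct.unpack_from(
            "<BHBB", sector, start)
        if attr_id == 0:        # unused slot
            continue
        raw = int.from_bytes(sector[start + 5:start + 11], "little")
        attrs[attr_id] = {"flags": flags, "current": current,
                          "worst": worst, "raw": raw}
    return attrs


def read_attribute(sector, attr_id):
    """Look up one attribute by number; None if the drive lacks it."""
    return parse_smart_attributes(sector).get(attr_id)
```

The driver stops here: what attribute 5 or 197 actually means on a given drive is a guess for a higher layer or a userspace tool to make.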
Brendan wrote:
  • supports IO priorities (so less important transfers don't ruin the performance of more important transfers)
  • cooperates with OS's caches, including having a robust "write ordering with sync" model for synchronisation
These are definitely features that should be implemented in a higher level of your storage subsystem and are not concerns of the low-level hardware-interfacing storage driver in any reasonably structured system.

Re: SAS HDD Drive

Post by LtG »

mallard wrote:
Brendan wrote:unexpected hot-unplug under load
The only "proper" way to handle that is to alert the user that their data is most likely lost and the filesystem corrupted. The "dirty" bit on most reasonable filesystems provides an indication that recovery is necessary. If it occurs on your "system drive" (the device that contains vital OS files or swap space) you have no choice but to "panic" (or maybe terminate every process that has data swapped-out to the now-unavailable device, assuming that's nothing critical). You might be able to get away with a "please re-connect that device immediately" type message in some very limited cases.
Why only in some very limited cases? Assuming the device itself is left in a sane state and assuming you have the "core OS" in RAM, what issues are there? You can halt all processes which try to access swapped out RAM.

Especially for non-system drives this shouldn't even be that hard.. And no data should be lost, all the queued write ops should be possible to complete once the device is plugged back in.
mallard wrote:
Brendan wrote:TRIM
Except on very early model SSDs, TRIM varies from useless to dangerous. Competent SSDs' internal garbage-collection routines are far more capable and TRIM often has a severe and unpredictable performance penalty (especially on devices that only support the non-queued version). In the best devices it's simply a no-op and in the worst it's buggy and corrupts your filesystem. Unless you've got the resources to test every SSD (family) on the market to work out the small minority of devices where it's correctly implemented and actually improves performance, it's best avoided.
Any references for that? I wasn't aware of TRIM being useless, and tried to quickly google but came up with nothing. The TRIM Wikipedia article also says that it's useful.

I know some devices have buggy firmware, but that applies to pretty much all hardware. You don't need to test every device out there, it's reasonable to expect devices to work according to spec and blame manufacturer for hardware defects. Also you can just look at Linux source to find out which devices have issues, the Wikipedia article also lists these. It's only a handful of models/families really..
mallard wrote: If you also have an encrypted filesystem (something any serious OS should support in 2017) it's even worse, because even in the unlikely case that it's implemented correctly by the hardware and necessary for that device, TRIM reveals metadata about which blocks are/are not used by your filesystem over an insecure channel. This metadata is then stored unencrypted in the "private space" on the SSD, where it's easily available to any sufficiently determined attacker.
I agree encryption is important. Apart from completely crippling performance, I'm not sure there's much you can do..

AFAIK, even if you don't use TRIM you still have the same issue. The way "better" SSDs handle performance is by having an internal pool of unused blocks (so say a 512GiB drive actually has 512+x GiB of storage, only exposing 512GiB to the user); the drive uses those for writing and can then deallocate (TRIM) the old used blocks. The main issue is that it's rather slow to erase a block, so you have to keep a pool of already erased blocks to have decent performance. Whether the FS/OS does it or the SSD does it internally, does it matter?

I haven't really done much research into the security implications of TRIM w.r.t. encryption, so I'd like to know whether the SSD's internal "TRIM pool" already has the same effect on encryption and if it does then using TRIM shouldn't harm..?


As for SMART, I never liked that it's not as good as it should've been. Realistically there's only a handful of failure modes for HDDs (and probably just a handful for SSDs as well), and there are only a few HDD manufacturers, which means they each have decades of experience. Being able to detect an imminent failure in many if not almost all cases is something they should be able to do, but it seems they really don't.. One problem of course is that there's a conflict of interest with consumer drives: notifying the user of imminent failure (which might still take months) might cause the user to replace the drive under warranty, while not notifying the user and allowing data loss months later might occur after the warranty has expired. So the manufacturer has no real incentive to provide good SMART...