Allocate RAM block device faster than Linux kernel can normally allocate memory

Background

I'm trying to download about 150 GB to a newly-created Linux box (AWS EC2) with a 100 Gbps network connection at full speed (12.5 GB/s), or close to that. The network end is working well. However, I'm struggling to find anywhere on the box that I can put all the data fast enough to keep up, even though the box has 192 GB of RAM.

My most successful attempt so far is to use the brd kernel module to allocate a large enough RAM block device and write into it in parallel. This works at the required speed (using direct I/O), but only once the block device has already been fully written to, for example with dd if=/dev/zero ...
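For concreteness, the setup looks roughly like the following (the device size and dd parameters are illustrative rather than the exact values I used; rd_size is in KiB):

    # Create a single ~160 GiB ramdisk backed by the brd module (rd_size is in KiB).
    modprobe brd rd_nr=1 rd_size=$((160 * 1024 * 1024))

    # Pre-fill the whole device once so every backing page gets allocated up front.
    # Only after this do parallel direct-I/O writes run at the required speed.
    dd if=/dev/zero of=/dev/ram0 bs=1M oflag=direct status=progress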

Unfortunately, when the brd device is newly created, it will only accept a write rate of around 2 GB/s.

My guess is that this is because brd hooks into 'normal' kernel-managed memory, and therefore when each new block is used for the first time, the kernel has to actually allocate it, which it does no faster than 2 GB/s.

Everything I've tried so far has the same problem. Seemingly, tmpfs, ramfs, brd, and everything else that provides RAM storage hooks into the normal kernel memory allocation system.

Question

Is there any way in Linux to create a block device out of real memory, without going through the kernel's normal memory management?

I'm thinking that perhaps there is a kernel module out there that will split off an amount of memory at boot time, to be treated like a disk. This memory would not be considered normal memory by the kernel, so there would be no risk of the kernel wanting to use it for anything else.
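Something along the lines of the memmap/pmem route is the kind of thing I have in mind, though I haven't verified it reaches the required speed. The size and physical offset below are placeholders and would need to match the instance's actual memory map:

    # Kernel command-line parameter (e.g. via GRUB): carve out 160 GiB of RAM,
    # starting at a 4 GiB physical offset, as an emulated persistent-memory region.
    #   memmap=160G!4G

    # After a reboot, the pmem driver should expose the region as a block device:
    lsblk /dev/pmem0

    # It could then be written with direct I/O, or formatted and mounted with DAX
    # so that file I/O bypasses the page cache:
    mkfs.ext4 /dev/pmem0
    mount -o dax /dev/pmem0 /mnt/ramdisk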

Alternatively, is there some way to get the kernel to fully initialise a brd ramdisk (or similar) quickly? I tried writing to the last block of the disk alone, but unsurprisingly that didn't help.

Non-RAM alternative

In theory, a RAID of NVMe SSDs could achieve the required write speed, although it seems likely that some bottleneck would prevent such high aggregate I/O. My attempts to use an mdadm RAID 0 array of 8 NVMe SSDs have been unsuccessful, partly, I think, because of difficulties around block sizes. To use direct I/O and bypass the kernel's caching (which seems necessary), the only block size I could get working was 4096, and that is apparently far too small to make efficient use of the SSDs themselves. Any alternative here would be appreciated.
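For reference, the RAID 0 attempt looked roughly like this (device names, chunk size and test sizes are from memory, so treat them as illustrative):

    # Stripe 8 NVMe devices into a single RAID 0 array.
    mdadm --create /dev/md0 --level=0 --raid-devices=8 --chunk=512K \
        /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 \
        /dev/nvme5n1 /dev/nvme6n1 /dev/nvme7n1 /dev/nvme8n1

    # Direct-I/O write test; 4096 bytes is the logical block size reported by
    # the devices, and the only request size I could get working with O_DIRECT.
    dd if=/dev/zero of=/dev/md0 bs=4096 count=$((10 * 1024 * 1024)) oflag=direct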

Comments

I know 2 GB/s sounds like a lot, and at that rate the whole download only takes a couple of minutes, but I need to go from no EC2 instance at all to an EC2 instance with 150 GB loaded in less than a minute. In theory this should be entirely possible: the network stack and the physical RAM are both capable of moving data that fast.
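Roughly, the arithmetic behind that constraint (ignoring boot and setup overhead):

    #   150 GB / 12.5 GB/s = 12 s   -> fits comfortably in a one-minute budget
    #   150 GB /  2.0 GB/s = 75 s   -> already over a minute at the current brd rate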

Thanks!

