Starting a blog!
Published:
I figured it would be fun to keep a record of all the projects I’m working on in a blog format, so I’ve set up the blog page on my website! You’ll be able to access this page at any time by going to ahl27.com/blog!
less than 1 minute read
2 minute read
Published:
If you’ve been following all the posts in this series, you’ll know that by now I have a pretty good way to read in edges for ExoLabel, due primarily to faster I/O and an optimized external sorting function (if you haven’t, check out the first post here!). I left off in my last post by mentioning that I wanted to change my external sort to work in-place, but I didn’t really explain why.
25 minute read
Published:
In my last post, I talked about ways to improve the efficiency of fwrite calls. Essentially, it boils down to prioritizing sequential reads and writes. At the end of that post, I mentioned that I need a way to sort a file on disk. Algorithms that do this are called external sorting algorithms. While they’re not used very often today, they were incredibly important in the past. Back in the days of tape drives, computers rarely had enough RAM to load a dataset into memory for sorting, so external sorts were used instead.
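For anyone who hasn't seen one before, here is a minimal sketch of a textbook two-phase external merge sort in C. It is not the ExoLabel implementation, just an illustration of the general idea under some assumptions of mine: the file holds raw binary int values, the function name external_sort and the MAX_RUNS cap are made up for this example, and each sorted run lives in its own temporary file.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_RUNS 64 /* hypothetical cap on sorted runs, to keep the sketch short */

static int cmp_int(const void *a, const void *b) {
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sorts the binary ints in `in_path` into `out_path` while holding at most
 * `chunk_len` values in RAM at once. Returns 0 on success, -1 on error. */
int external_sort(const char *in_path, const char *out_path, size_t chunk_len) {
    assert(chunk_len > 0);
    FILE *in = fopen(in_path, "rb");
    if (!in) return -1;
    int *buf = malloc(chunk_len * sizeof *buf);
    if (!buf) { fclose(in); return -1; }

    FILE *runs[MAX_RUNS];
    size_t nruns = 0, n, i;
    int status = 0;

    /* Phase 1: read the input sequentially in chunks, sort each chunk in
     * RAM, and write it out as a sorted run in its own temporary file. */
    while ((n = fread(buf, sizeof *buf, chunk_len, in)) > 0) {
        if (nruns == MAX_RUNS || !(runs[nruns] = tmpfile())) { status = -1; break; }
        qsort(buf, n, sizeof *buf, cmp_int);
        if (fwrite(buf, sizeof *buf, n, runs[nruns]) != n) status = -1;
        rewind(runs[nruns]);
        nruns++;
        if (status) break;
    }
    fclose(in);
    free(buf);

    /* Phase 2: k-way merge. Keep one "head" value per run and repeatedly
     * write out the smallest. Every file is read or written front to back;
     * the one-value fread/fwrite calls rely on stdio's own buffering. */
    FILE *out = (status == 0) ? fopen(out_path, "wb") : NULL;
    if (!out) status = -1;
    int head[MAX_RUNS], live[MAX_RUNS];
    for (i = 0; i < nruns; i++)
        live[i] = status == 0 && fread(&head[i], sizeof(int), 1, runs[i]) == 1;
    while (status == 0) {
        size_t best = nruns;
        for (i = 0; i < nruns; i++)
            if (live[i] && (best == nruns || head[i] < head[best])) best = i;
        if (best == nruns) break; /* every run is exhausted */
        if (fwrite(&head[best], sizeof(int), 1, out) != 1) status = -1;
        live[best] = fread(&head[best], sizeof(int), 1, runs[best]) == 1;
    }

    for (i = 0; i < nruns; i++) fclose(runs[i]);
    if (out && fclose(out) != 0) status = -1;
    return status;
}
```

A call like external_sort("edges.bin", "edges_sorted.bin", 1 << 20) (both file names are placeholders) would sort the file using roughly 4 MB of buffer on a platform with 4-byte ints. A linear scan over the run heads is fine for a handful of runs; with many runs, a min-heap would be the usual replacement.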
fwrite so slow?
8 minute read
Published:
My last post talked about some of the things I discovered when looking into how to optimize my current research project, ExoLabel. Since then, I’ve made some big improvements in terms of speed, and I thought it would be worth it to break them down. Building efficient external memory algorithms is a really cool process; every potential inefficiency matters a lot more than with RAM-based algorithms, so you start to understand how the computer works at a deeper level.
fseek
15 minute read
Published:
I’m working a lot with files for my latest research project, ExoLabel. I’ve (mostly) finished ensuring that the algorithm itself is accurate, so lately I’ve been turning my attention to optimizing its speed. Unsurprisingly, most of the slowest operations are working with files, since accessing files is orders of magnitude slower than accessing RAM.