Category: tech

Letter to an incoming CS Undergrad

Dear Jiana,

I heard from your mother that you are enrolling as an undergraduate Computer Science major.

First of all, I want to congratulate you on successfully getting to college. Though it might seem “everyone goes to college these days”, that does not diminish your achievement in the least. Comparisons matter, but by definition only relatively. The work you put in through 12 years of schooling, then college applications, and everything else were tasks given to you as a child, explicitly or implicitly. Maybe you liked them, maybe you did not, but what matters is that you saw things through to completion. So again, congratulations.

Second, I want to welcome you to the field of computers. It’s a friendly field; the hacker ethos means there is always someone willing to reach out and help – as long as you put in the work first ;). It’s also, surprisingly, very accessible. Programmers like nothing better than to extol and trumpet their work; fortunately for programmers, they also invented the internet. You will soon hear, see and meet many bright, talented and industrious people in this space that you can learn from. Have fun making new friends!

Third, it’s alright to take a while to become “good”. Maybe you won’t even want to be a good programmer. But if you do, it takes time. There’s no instant cheat code. The only cheat codes I know are to study a lot, do a lot of side projects, and meet and follow interesting people to see what they’re doing (find “Hacker News” and make it your daily ritual to skim through the headlines). Admittedly, I’m not very good at any of those, or was very late in starting them, but maybe you can make use of them. There’s no cheat code to being “good”, so work hard!

Fourth, I think the aspect that makes programmers fall in love with programming is the freedom. With software, you have the freedom to do almost anything. If you can think of it, it can be done. I’ve been doing this for 5 years now (including time in college) and I don’t think I got it until this year, so don’t fret if you don’t get it right away. The freedom to do what I want is honestly intoxicating. I am only limited by my thoughts and by how well I can transfer them to my fingers. I hope you will find that freedom as well.

Fifth, it’s ok to switch direction. I was in chemical engineering for 2 years in college before I landed on computers. At first, I thought CS would be my minor. That intro class got me hooked and I went all in. Maybe for you it will be the other way around: you don’t like computers at all, you hate looking at screens all day, your posture has gone bad and your eyes hurt, you just don’t enjoy it the way others seem to. That’s fine. Don’t make decisions you feel you will regret later. Do things because they make sense to you and your priorities. Be careful of the sunk-cost fallacy. People’s advice is just that, advice. Remember that it’s your life and your future. At 18, you became an adult legally, and with that comes freedom and responsibility to yourself. Look out for yourself!

Finally, the only concrete advice I’ll give you is going to be in this paragraph. Get a Mac computer or install Linux on your machine and know that Windows sucks. Use the command line. Put all your homework and notes and diaries and projects on GitHub or something similar, even if they’re only private to you. Use the command line. Self-marketing doesn’t have to be icky; think of it as “increasing the surface area for luck to land on”, or, in other words, start writing a blog and share. Use the command line. Protect your eyes, and I suggest doing something physical at least once every two days. Use the command line. Slow is smooth, smooth is fast, and we must be as fast as possible because it’s always better to be fast; so learn how to type faster, learn how to read faster, learn how to learn faster. Use the command line. Read Hacker News. Use the command line.

Good luck and hack on!

from your mother’s colleague,

Viet Than

the database discovery

This is probably my most interesting story so far at this job. No lie, I really did discover a database in production that no one else knew existed.

It started when Kobi, AppCard’s Operations Director, approached me one day and said, “Hey Viet, can you look into why one of our jbrains wasn’t backed up?”.

the data recovery

Many developers will have done this, and some probably do it as a daily routine, but a recent data recovery job of mine felt like the latest expression of my career’s progress so far.

line goes up

Crypto and its problems

“Line Goes Up – The Problem With NFTs” - Folding Ideas

and

[M]arkets are distributed systems.

Even though there are, in fact, very strict regulators and regulations, I can still enter into a contract with you without ever telling anyone. I can buy something from you, in cash, and nobody needs to know. (Tax authorities merely want to know, and anyway, notifying them is asynchronous and lossy.) Prices are set through peer-to-peer negotiation and supply and demand, almost automatically, through what some call an “invisible hand.” It’s really neat.

As long as we’re in the continuous control region.

As long as the regulators are doing their job.

Here’s what everyone peddling the new trendy systems is so desperately trying to forget, that makes all of them absurdly expensive and destined to fail, even if the things we want from them are beautiful and desirable and well worth working on. Here is the very bad news:

Regulation is a centralized function.

The job of regulation is to stop distributed systems from going awry.

Because distributed systems always go awry

I find myself linking to this article way too much lately, but here it is again: The Tyranny of Structurelessness by Jo Freeman. You should read it. The summary is that in any system, if you don’t have an explicit hierarchy, then you have an implicit one.

Despite my ongoing best efforts, I have never seen any exception to this rule.

Even the fanciest pantsed distributed databases, with all the Rafts and Paxoses and red/greens and active/passives and Byzantine generals and dining philosophers and CAP theorems, are subject to this. You can do a bunch of math to absolutely prove beyond a shadow of a doubt that your database is completely distributed and has no single points of failure. There are papers that do this. You can do it too. Go ahead. I’ll wait.

<several PhDs later>

Okay, great. Now skip paying your AWS bill for a few months.

Whoops, there’s a hierarchy after all!

Tô Minh Sơn’s comment on apenwarr:

“Men prefer to will nothingness than to not will”

He also pointed me to Chainalysis’s ranking of crypto adoption, with Vietnam on top.

you are not the tool

he is not the tool, he is the developer of the tools

  • Amichay Oren

niche tech in film

I just want to post about this sexy beast that is currently situated at the Mono No Aware film lab in Brooklyn, New York. Let me try to colorfully recount what Steve Cossman, Mono’s director, told me:

This is 1 of 18 machines in the world. The hardware is handbuilt by one guy and the software is handbuilt by another. Its full cost is $250,000 but they made one at $30,000 for Mono. It’s got 32TB of hard drive temporarily, as the guy will come next week to upgrade that. It’s hooked up to a Windows PC that hosts the processing software and exports to the data tower, and we’ve got a Mac hooked up to that for ease of data transport. It scans 8 frames a second at 4K resolution. We drove it to the lab in the middle of a snow squall, and I have to thank a cinematography.com guru for helping set it up for us.

what a sexy scanner
Xena control module

That’s it, just niche tech that most will not get to see. Unless they come to Mono No Aware.

scaling with openvpn

You know your company is growing when your OpenVPN --max-clients limit suddenly needs to be raised above the default of 1024, or else the OpenVPN server dies and everyone thinks it’s a firewall issue.

how do you database?

At my previous job, govtech/tax-tech, the database was just as important as the code. Now what do I mean by that? Mooney explained it best on this exact topic:

Given how much thought and effort goes into source code control and change management at many of these same companies, it is confusing and a little unsettling that so much less progress has been made on the database change management front. Many developers can give you a 15 minute explanation of their source code strategy, why they are doing certain things and referencing books and blog posts to support their approach, but when it comes to database changes it is usually just an ad-hoc system that has evolved over time and everyone is a little bit ashamed of it.

I believe the quote above is true. Admittedly, I’m only a 1+ YOE software engineer, but having jumped ship from a govtech consultancy to a startup, I find there is a lot to compare in how databases are treated and how this leads to a better developer experience.

What follows is a series of features I found missing at my current place of work.

1. Version Control

Schemas have version control. The system detects any change made to the schema because you make table structure changes through the system (in fact, the company never taught you to alter tables with SQL). Deleting columns, adding or editing comments, adding or editing indexes (and probably many more): local changes are “synced” with the shared work server, which assigns a version number to your structure. Migrating from local to testing environments and then, ultimately, to Prod simply means pointing the environment at the right version.
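To make the “environments point at a version” idea concrete, here is a minimal sketch. It is not how that system was actually built (I never saw its internals); `SchemaRegistry`, `sync`, `pointEnvironment` and the column names are all made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical snapshot of a table structure: column name -> column type.
using Schema = std::map<std::string, std::string>;

class SchemaRegistry {
public:
    // "Sync" a local structure to the shared server and return its version number.
    // Versions are numbered from 1; an unchanged structure keeps the latest number.
    std::size_t sync(const Schema& schema)
    {
        if (versions_.empty() || versions_.back() != schema) {
            versions_.push_back(schema);
        }
        return versions_.size();
    }

    // "Migrate" an environment by pointing it at an existing version.
    void pointEnvironment(const std::string& env, std::size_t version)
    {
        if (version == 0 || version > versions_.size()) {
            throw std::out_of_range("unknown schema version");
        }
        environments_[env] = version;
    }

    // Look up the structure an environment currently uses.
    const Schema& schemaFor(const std::string& env) const
    {
        return versions_.at(environments_.at(env) - 1);
    }

private:
    std::vector<Schema> versions_;                     // index 0 holds version 1
    std::map<std::string, std::size_t> environments_;  // environment -> version
};

// Usage sketch: promoting a new column to prod is just repointing prod.
inline void demoPromotion()
{
    SchemaRegistry registry;
    const std::size_t v1 = registry.sync({{"id", "INT"}});
    const std::size_t v2 = registry.sync({{"id", "INT"}, {"note", "TEXT"}});

    registry.pointEnvironment("testing", v2);
    registry.pointEnvironment("prod", v1);
    assert(registry.schemaFor("prod").count("note") == 0);

    registry.pointEnvironment("prod", v2);
    assert(registry.schemaFor("prod").count("note") == 1);
}
```

The point of the sketch is that no environment ever runs an ad-hoc ALTER; each one only records which numbered structure it should be on.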

nothing else to be done

After 25 years of career, I still have to see an organization where things are so perfect that:

  • no refactoring is needed
  • no additional documentation is useful, it’s all there shiny and beautiful. And it updates itself nightly.
  • logging/monitoring/diagnostic tools are perfect
  • builds are so fast that you wonder if you did press enter
  • all necessary linters are configured and used
  • everything has unit tests
  • and integration tests
  • and there’s enough time for exploring alternative technologies for future development
  • and enough time for contributing feature/fixes upstream for the open source things you use
  • and you cannot build tools to answer asks from customers even faster

So yes, you may not get official tickets assigned to you, but it doesn’t mean there’s nothing else to be done. Perceiving that need is the first step for moving from junior to more senior role, acting on that need is the second step.

Now, depending on the country you’re in, social norms may make you unpopular among co-workers and managers alike if you move too much, so there’s that.

Beautiful lessons by /u/mavvam

It looks like a product but is secretly a subscription

https://calpaterson.com/printers.html

microsoft acquires activision blizzard

It’s wild how Microsoft has been able to vertically integrate gaming. They now own the distribution (Xbox Cloud Gaming, Xbox Game Pass), the games (Call of Duty, WoW, Starcraft + what they owned before), the OS (Windows, Xbox), the hardware (Xbox, many PCs), and the back end compute (Azure). The only thing they’re missing, the network bandwidth, is mostly a commodity anyway.

  • curiousllama (HackerNews)

setting personal blog on work machine

Ok, I know this is probably not the best idea but I was following this:

How to manage multiple GitHub accounts on a single machine with SSH keys

today sucks

But yes, this works!

copenhagen interpretation of ethics

Current situation:

So let me get this straight… Up until now, Amazon/AWS hasn’t donated anything to cURL. And noone hated them for it.

Suddenly when someone - probably some manager with a limit of $5000 on donations - pushes through a donation of $5000, everyone hates them? Are you serious? If I was a cURL developer this would absolutely make my day.

When you develop OSS (Open Source Software), you aren’t doing it for the money, you don’t even know whether anyone is going to be using your software. And sure, you could limit the license so that big corps have to pay, but because that’d become a legal nightmare for them quickly, they (and probably by extension everyone else as well) will just skip your software and use or make something else.

So you make it copyleft or fully open, and then thank for donations no matter their size. A shitton of OSS devs don’t get any donations.

Philosophical followup:

Copenhagen Interpretation of Ethics

bits between the bits

CppCon 2018: Matt Godbolt “The Bits Between the Bits: How We Get to main()”

I was reminded of this video because of my interviews with fintech companies (who love C++). Matt Godbolt is very famous in the C++ community, of course, because of his website godbolt.org. And it’s talks like this that show why some people think C++ programmers are wizards, or insane, to work with such a language.

fintech 1 interview

I got through the online assessment and first round interview with a NY fintech company. Here are my reflections on the two parts. A learning experience, that’s for sure.

social graph

Eugene Wei’s “And You Will Know Us by the Company We Keep”

In my three pieces on TikTok, I wrote about how that app’s architecture is fundamentally different from that of most Western social media. TikTok doesn’t need you to follow any accounts to construct a relevant feed for you. Instead, it does two things.

First, it tries to understand what interests you by observing how you react to everything it shows you. It tries to learn your taste, and it does a damn good job of it. TikTok is an interest graph built as an interest graph.

Secondly, TikTok runs every candidate video through a two-stage screening process. First, it runs videos through one of the most terrifying, vicious quality filters known to man: a panel of a few hundred largely Gen Z users. Okay, yes, that’s not quite right. Anyone can be on this test audience for a video. It just happens, however, that TikTok’s user base skews younger, so most of the people on that panel will be Gen Z. Also, it’s a known fact that a pack of Gen Z users muttering “OK Boomer” is the most terrifying pack hunter in the animal kingdom after hyenas and murder hornets. If those test viewers don’t show any interest, the video is yeeted into the dustbin of TikTok, never to be seen again except if someone seeks it out directly on someone’s profile.

Secondly, it then uses its algorithm to decide whether that video would interest each user based on their taste profile. Even if you don’t follow the creator of a video, if TikTok’s algorithm thinks you’ll enjoy it, you’ll see it in your For You Page.

Recently, Instagram announced it would start showing its users posts from accounts they don’t follow. In many ways, this is as close to a concession as we’ll see from Instagram to the superiority of TikTok’s architecture for pure entertainment. …

bombed interview 1

I bombed an interview on Friday. Here is the reflection on it.

thank you old coworker

I had a coworker who left, sad. But the good news is that means his now defunct account (on Teams) can be used so I can transfer links/reading I did on the work computer to outside.

I think these are all tech stuff.

Will write these up later in their own posts. So thanks Taesan!

  1. https://rachelbythebay.com/w/2011/06/01/megaraid/

  2. https://ravimohan.blogspot.com/2007/04/learning-from-sudoku-solvers.html

  3. http://norvig.com/21-days.html

  4. https://www.biteinteractive.com/picturing-git-conceptions-and-misconceptions/

  5. https://mangadex.dev/mangadex-v5-infrastructure-overview/

  6. https://news.ycombinator.com/item?id=28440742

  7. https://news.ycombinator.com/item?id=28443625

  8. https://news.ycombinator.com/item?id=28446761

JavaScript quirks

Jim Cowart’s excellent 7 JavaScript quirks

m1 linux driver

So the M1 IOMMU driver was just merged by Linus Torvalds into the Linux kernel.

The best part is you can see the development being streamed on YouTube.

redundant data

A clever extension of this idea was introduced in C-Store and adopted in the commercial data warehouse Vertica. Different queries benefit from different sort orders, so why not store the same data sorted in several different ways? Data needs to be replicated to multiple machines anyway, so that you don’t lose data if one machine fails. You might as well store that redundant data sorted in different ways so that when you’re processing a query, you can use the version that best fits the query pattern.

  • Martin Kleppmann, Designing Data-Intensive Applications
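A toy version of that idea, to make it concrete. This is only a sketch under my own assumptions, not how C-Store or Vertica implement it: the `Sale` row type, the `ReplicatedSales` class and the two queries are invented for illustration, and both "replicas" live in one process instead of on separate machines.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical row type; the field names are made up for illustration.
struct Sale {
    int day;              // e.g. days since some epoch
    std::string product;
    int amount;
};

// Two copies of the same rows, each kept in a different sort order.
class ReplicatedSales {
public:
    explicit ReplicatedSales(std::vector<Sale> rows)
        : byDay_(rows), byProduct_(std::move(rows))
    {
        std::sort(byDay_.begin(), byDay_.end(),
                  [](const Sale& a, const Sale& b) { return a.day < b.day; });
        std::sort(byProduct_.begin(), byProduct_.end(),
                  [](const Sale& a, const Sale& b) { return a.product < b.product; });
        // Both replicas must hold the same rows, only ordered differently.
        assert(byDay_.size() == byProduct_.size());
    }

    // Day-range query: answered from the day-sorted copy with binary search.
    int totalForDays(int firstDay, int lastDay) const
    {
        if (firstDay > lastDay) return 0;
        auto lo = std::lower_bound(byDay_.begin(), byDay_.end(), firstDay,
                                   [](const Sale& s, int d) { return s.day < d; });
        auto hi = std::upper_bound(byDay_.begin(), byDay_.end(), lastDay,
                                   [](int d, const Sale& s) { return d < s.day; });
        int total = 0;
        for (auto it = lo; it != hi; ++it) total += it->amount;
        return total;
    }

    // Per-product query: answered from the product-sorted copy instead.
    int totalForProduct(const std::string& product) const
    {
        auto lo = std::lower_bound(byProduct_.begin(), byProduct_.end(), product,
                                   [](const Sale& s, const std::string& p) { return s.product < p; });
        auto hi = std::upper_bound(byProduct_.begin(), byProduct_.end(), product,
                                   [](const std::string& p, const Sale& s) { return p < s.product; });
        int total = 0;
        for (auto it = lo; it != hi; ++it) total += it->amount;
        return total;
    }

private:
    std::vector<Sale> byDay_;      // replica sorted by day
    std::vector<Sale> byProduct_;  // replica sorted by product
};
```

Each query only touches the contiguous slice of the copy whose sort order fits it, which is the whole benefit: the redundancy you needed anyway doubles as a second index.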

compiler magic

Magic is here, it is at your fingertips.

Recently, I’ve come across a not so efficient implementation of a isEven function

bool isEven(int number)
{
    int numberCompare = 0;
    bool even = true;

    while (number != numberCompare)
    {
        even = !even;
        numberCompare++;
    }
    return even;
}

… Surprisingly, Clang/LLVM is able to optimize the iterative algorithm down to the constant time algorithm (GCC fails on this one). In Clang 10 with full optimizations, this code compiles down to:

; Function Attrs: norecurse nounwind readnone ssp uwtable
define zeroext i1 @_Z6isEveni(i32 %0) local_unnamed_addr #0 {
  %2 = and i32 %0, 1
  %3 = icmp eq i32 %2, 0
  ret i1 %3
}

^ That, is magic
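For comparison, here is roughly what that IR looks like if you write it by hand in C++. A small sketch: `isEvenConstant` and `checkAgreement` are my names, not from the quoted post. Note the two versions only agree for non-negative inputs, because the loop has to overflow a signed int (undefined behavior) before it can reach a negative number.

```cpp
#include <cassert>

// Loop version from the quoted post, kept here for comparison.
bool isEven(int number)
{
    int numberCompare = 0;
    bool even = true;

    while (number != numberCompare)
    {
        even = !even;
        numberCompare++;
    }
    return even;
}

// Hypothetical hand-written equivalent of the optimized IR:
// mask the lowest bit (and i32 %0, 1), then compare it with zero (icmp eq).
constexpr bool isEvenConstant(int number)
{
    return (number & 1) == 0;
}

// The constant-time version can even be evaluated at compile time.
static_assert(isEvenConstant(42), "42 is even");
static_assert(!isEvenConstant(7), "7 is odd");

// Spot-check that both versions agree on small non-negative inputs.
inline void checkAgreement()
{
    for (int n = 0; n < 1000; ++n)
    {
        assert(isEven(n) == isEvenConstant(n));
    }
}
```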

5 interview questions: C++, Area 2

reading reddit netsec

I browsed the top of /r/netsec yesterday night. I am in awe really. There is…

…the work of Sam Curry, who spent 3 months hacking Apple with his team and submitted 55 vulnerabilities for a total of over 280k in bounty payouts.

…and this Quora answer on Stuxnet as the most sophisticated software ever written by John Byrd (of Sega Dreamcast fame)

…or this junior high schooler who broke GitHub private pages in his spare time during Covid and got some nice pocket money.

…or even the write up by Troy Hunt on another data breach being included in the pwnd database.

This world is amazing. This could be the closest I’ll be to a ninja spy. lol.

a glimpse of the future

A participant in OpenAI’s Codex invitational recently posted on HackerNews an unlisted video of how the DOM is manipulated by Codex with natural language text.

I’m reminded of this quote:

” Once technology rolls over you, if you’re not part of the steamroller, you’re part of the road. “ – Stewart Brand

I don’t think people realize just how much the future is already here (psssst, the guy at the bottom is not real either).

career framework

A month ago, Dropbox released their career framework (“how do we determine you get promoted/raise?”).

glue

Tanya Reilly has an excellent talk (and transcribed slides) called Being Glue that perfectly captures this effect. In her words: “Glue work is expected when you’re senior… and risky when you’re not.”

What she calls glue work, I’m going to call systems design. They’re two sides of the same issue. Humans are the most unruly systems of all, and yet, amazingly, they follow many of the same patterns as other systems.

People who are naturally excellent at glue work often stall out early in the prescribed engineering pipeline, even when they’d be great in later stages (staff engineers, directors, and executives) that traditional engineers struggle at. In fact, it’s well documented that an executive in a tech company requires almost a totally different skill set than a programmer, and rising through the ranks doesn’t prepare you for that job at all. Many big tech companies hire executives from outside the company, and sometimes even from outside their own industry, for that reason.

  • Apenwarr, Systems design explains the world: volume 1

glue

Is COBOL holding you hostage with Math?

So a certain country blocks Medium so I’m recreating it here

Author: Marianne Bellotti Jul 28, 2018 · 12 min read

Face it: nobody likes fractions, not even computers.

When we talk about COBOL the first question on everyone’s mind is always Why are we still using it in so many critical places? Banks are still running COBOL, close to 7% of the GDP is dependent on COBOL in the form of payments from the Centers for Medicare & Medicaid Services, The IRS famously still uses COBOL, airlines still use COBOL (Adam Fletcher dropped my favorite fun fact on this topic in his Systems We Love talk: the reservation number on your ticket used to be just a pointer), lots of critical infrastructure both in the private and public sector still runs on COBOL.

Why?

5 interview questions

It’s not 5 interview questions, it’s 5 categories of interview questions.

Goal: do these all in C++, Java, and Rust

Steve Yegge influenced a lot of people with this post.

jersey style

The lesson to be learned from this is that it is often undesirable to go for the right thing first. It is better to get half of the right thing available so that it spreads like a virus. Once people are hooked on it, take the time to improve it to 90% of the right thing.

A wrong lesson is to take the parable literally and to conclude that C is the right vehicle for AI software. The 50% solution has to be basically right, and in this case it isn’t.

From the apocryphal The Rise of “Worse is Better” by Richard Gabriel.

penetration testing

Adventures in smart buttplug penetration testing

These guys know how to draw in a crowd, that’s for sure.

HackerNews comments

More can be found with danluu’s archive of HN comments

For those who work inside Google, it’s well worth it to look at Jeff & Sanjay’s commit history and code review dashboard. They aren’t actually all that much more productive in terms of code written than a decent SWE3 who knows his codebase.

apenwarr

Blog of Avery Pennarun at https://apenwarr.ca/log/

This guy was an L7 at Google. His best posts so far are the ones on bug triage and system design.

I can’t quite understand his IPv6 posts yet.

I wish I could get into Tailscale (his new company).