
Blog

Total Engineering

You've heard the expression "total war"; it's pretty common throughout human history. Every generation or so, some gasbag likes to spout about how his people have declared "total war" against an enemy, meaning that every man, woman, and child within his nation was committing every second of their lives to victory. That is bullshit on two basic levels. First of all, no country or group is ever 100 percent committed to war; it's just not physically possible. You can have a high percentage, so many people working so hard for so long, but all of the people, all of the time? What about the malingerers, or the conscientious objectors? What about the sick, the injured, the very old, the very young? What about when you're sleeping, eating, taking a shower, or taking a dump? Is that a "dump for victory"? That's the first reason total war is impossible for humans. The second is that all nations have their limits. There might be individuals within that group who are willing to sacrifice their lives; it might even be a relatively high number for the population, but that population as a whole will eventually reach its maximum emotional and physiological breaking point. The Japanese reached theirs with a couple of American atomic bombs. The Vietnamese might have reached theirs if we'd dropped a couple more, but, thank all holy Christ, our will broke before it came to that. That is the nature of human warfare, two sides trying to push the other past its limit of endurance, and no matter how much we like to talk about total war, that limit is always there…unless you're the living dead. For the first time in history, we faced an enemy that was actively waging total war. They had no limits of endurance. They would never negotiate, never surrender. They would fight until the very end because, unlike us, every single one of them, every second of every day, was devoted to consuming all life on Earth. That's the kind of enemy that was waiting for us beyond the Rockies. 
That's the kind of war we had to fight.

— World War Z: An Oral History of the Zombie War, Max Brooks


This was going to be a long post, but nothing I wrote makes much sense to me. The general gist is that AI reduces the cognitive cost of "engineering": of sitting down and doing research, cost-benefit analysis, etc. "Winging it" used to be the rational choice, because the cost of doing better was higher than the expected value of the improvement. AI has changed that. And this is happening across all levels of humanity: from the individual to the group (company, state/society, ideology/religion, etc.).

Everyone has their own story of how AI helps them with something. AI before healthcare visits, AI for "digital afterlife", AI companions, AI to rehearse hard conversations, etc. etc. etc.

Companies are throwing capital (human and financial) at documentation, which is a few steps removed from having "company as code", which is a few steps removed from the theoretical one-person unicorn. But any industry or job that has a computer will be touched by AI.

I don't know much about government AI usage, but I can only imagine it's being used much as it is in companies, with more focus on public services and policy making. And then there are the other, more worrying, use cases like surveillance, propaganda, and warfare.

Now Pope Leo XIV already came out with Magnifica Humanitas, but some Korean Buddhists adopt their first Buddhist Monk. Could we some day have an official Christian AI accessible to everyone at all times? You can already go and converse with any notable human in history.

The point is that AI is going to transform everything.

Yet this technology, this means of liberating the mind from "cognitive load", is not free like the internet (yet?), it's not open source (yet? kinda? Linux-of-AI when?), and it can be taken away from you.

So if we know that everything about life is going to be "engineered" by AI, the question staring down at all of us like the barrel of a gun is: Who owns the AI that is doing the engineering?

Betrayal is insidious

in·sid·i·ous — /inˈsidēəs/ — adjective

proceeding in a gradual, subtle way, but with harmful effects.

I've been thinking about betrayal, and how its after-effects spread silently, hampering people, teams, projects, and partnerships, ultimately leaving a trail of extra work and high-intensity emotions for both sides.

Specifically, the story goes like this. A smaller company is entering into a business partnership with a much bigger company. The bigger company only works with the small company because the small company already has a long-time partnership with a significant mutual client. At first, things go well: the big company just integrates with the smaller company's technology. People from both companies get to know each other a little bit. It's still business, but it's not uncomfortable.

That project goes great, so executives on all sides decide to take things up a notch. This time, for this new project, the smaller company is going to create some custom work that the bigger company is going to integrate with, all under the eyes and expectations of the mutual client.

Slack channels were created, Google Docs shared, weekly meetings scheduled, statements of work signed. And like all technology projects, deadlines just seemed to creep up on both sides. Soon, there were scores of Slack threads and email threads for engineers from both sides to interface on the inevitable quirks of the system being built. And it's not like everyone had communication hygiene: multiple questions were asked in one go, and the other side would answer some in the same thread, some in a totally different thread, and some in a different medium or over video calls.

The "betrayal" came when the project manager of the bigger company came to a meeting in front of the mutual client (probably either to cover their ass or backed by their executive) with a Google Sheet they had compiled of all the "unresponded questions" that the smaller company had not answered. Was the bigger company's PM technically correct? Sure. Were they right? Absolutely not. The smaller company's executive was baffled, its project manager wasn't prepared and couldn't give a good defense, and the engineers (who weren't in the room) had to scramble and play catch-up after the fact, going back over weeks of conversation to compile when and where answers were provided.

Did business go on as usual? Of course. The smaller company stuck to the statement of work after that. Was the appropriate information relayed through backchannels to the mutual client? Sure, they still like the smaller company after all. Did communication slow down by a lot? Absolutely, for now the smaller company is always going to be in "cya" mode.

Thoughts about routine

I've recently gotten into running. It was slow progress. I skipped most days. But every week I do go run once. I think it does help my mood. Early on, I definitely had the New Year's resolution feel where I ran almost every 2 days. But anyway, what's important is that I'm still barely keeping up with it. I've slowly introduced other things. I started doing dead hangs and active hangs, just 20-30 seconds. I then started lateral raises because I heard they're good for my physique. Now I also do standing tricep dumbbell extensions (?). I am totally not exercising in any scientific way, but I think it's a bit about habit building.

It is very easy to lose your routine. I like to blame others, but at the end of the day, it's me who needs to keep it. What I really need is just 150 minutes of exercise a week (2h30m/w).

I feel frustrated because routine and good habits don't come easy. I also wish for the positive effects exercise would have on me (physiological, mental, attractiveness, etc.), but why must good things be so hard?

This is just a rant, but it's about making sure I keep going.

Letter to an incoming CS Undergrad

Dear Jiana,

I heard from your mother that you are enrolling as an undergraduate Computer Science major.

First of all, I want to congratulate you on successfully getting to college. Though it might seem "everyone goes to college these days", that does not diminish your achievement in the least. Comparisons matter, but by definition only relatively. The work you put in through 12 years of schooling and then college application and everything else were tasks given to you as a child explicitly or implicitly. Maybe you did or did not like them, but what matters is you saw things through to completion. So again, congratulations.

Second, I want to welcome you to the field of computers. It's a friendly field; the hacker ethos means there is always someone willing to reach out and help -- as long as you put in the work first ;). It's also, very surprisingly, very accessible. Programmers like nothing more than to extol and trumpet their works; fortunately for programmers, they also invented the internet. You will soon hear and see and meet many, many bright, talented, and industrious people in this space that you can learn from. Have fun making new friends!

Third, it's alright to take a while to become "good". Maybe you won't even want to be a good programmer. But if you do, it takes time. There's no instant cheat code. The only cheat codes I know are to study a lot, do side projects a lot, and meet and follow interesting people and see what they're doing a lot (find "Hacker News" and make it your daily ritual to skim through the headlines). Admittedly, I'm not very good at any of those, or was very late in doing them, but maybe you can make use of them. There's no cheat code to being "good", so work hard!

Fourth, I think the aspect that makes programmers fall in love with programming is the freedom. With software, you have the freedom to do almost anything. If you can think of it, it can be done. I've been doing this for 5 years now (including time in college), and I don't think I got it until this year, so don't fret if you don't get it right away. The freedom to do what I want is honestly intoxicating. I am only limited by my thoughts and transferring them to my fingers. I hope you will find that freedom as well.

Finally, it's ok to switch direction. I was in chemical engineering for 2 years in college before I landed on computers. At first, I thought CS would be my minor. That intro class got me hooked and I went all in. Maybe for you it will be the other way: you don't like computers at all, you hate looking at screens all day, your posture has gone bad and your eyes hurt, you just don't enjoy it the way others seem to. That's fine. Don't make decisions you feel you will regret later. Do things because they make sense to you and your priorities. Be careful of the sunk-cost fallacy. People's advice is just that, advice. Remember that it's your life and your future. At 18, you became an adult legally, and with that comes freedom and responsibility to yourself. Look out for yourself!

Finally, the only concrete advice I'll give you is going to be in this paragraph. Get a Mac computer or install Linux on your machine and know that Windows sucks. Use the command line. Put all your homework and notes and diaries and projects on GitHub or something similar, even if they're only private to you. Use the command line. Self-marketing doesn't have to be icky, think of it as "increasing the surface area for luck to land on"; or, in other words, start writing a blog and share. Use the command line. Protect your eyes, and I suggest doing something physical once every two days at least. Use the command line. Slow is smooth, smooth is fast, and we must be as fast as possible because it's always better to be fast; so learn how to type faster, learn how to read faster, learn how to learn faster. Use the command line. Read Hacker News. Use the command line.

Good luck and hack on!

from your mother's colleague,

Viet Than

obsession

Books are my obsession. The finishing of a story is a grinding pressure on me when I start on every journey. It doesn't take long to know if you'll enjoy a story or not, just a few chapters. Before you know it, a vortex has pulled you into the depths. Sometimes I think it's an incredibly toxic and unproductive thing. I am probably squandering a lot of potential. But whenever I cast my gaze into the past, I can't help but think fondly of those nights. Those were the most sublime moments of personal freedom, moments when I really could let go of everything else in the world.

the database discovery

This is probably my most interesting story so far at this job. No lie, I really did discover a database in production that no one else knew existed.

It started when Kobi, AppCard's Operations Director, approached me one day and said, "Hey Viet, can you look into why one of our jbrains wasn't backed up?".

For context, jbrains are the on-prem devices AppCard deploys to our customers (the grocery stores). These brains sit in the grocery store's network and communicate with the various Point-of-Sale devices to administer coupons, the loyalty system, etc. The jbrain is highly configurable, as each grocer has different needs and integrations.

I knew the jbrain's "backups" are really just daily copies of these configuration files, stored on a server in AWS (we will leave aside the question of why not S3). With these files, a replacement jbrain can be "built" with the same configuration if there is a hardware failure or the like.

After confirming that the jbrain in question had no backups, and that other jbrains were actually missing their backups too, the only suspects were a bug in the backup process or a bug on the backup server. Now I know this backup server; the tech support guys and I use it every day to do our work. But it's a holy mess of scripts created by half a dozen sysadmins that I never got a knowledge transfer on, so we can't start our search there. How about the process? Do we know how the backup process works? Of course we don't. And just as obviously, the guys that actually built it are long gone and didn't leave behind any documentation on either the process or the server. The only clue I had was someone mentioning: "I think it's scheduled to run daily at 1 or 2AM or something".

Now that could mean anything, but to me, that sounded like a crontab. At the very least, I hoped the crontab existed on the backup server and not some other server, cause oh boy do we have a lot of servers (as an aside, this monstrosity of complexity is being worked on, with no end in sight). I was able to find a way to output every possible cronjob (users, cron directories), and nestled in all those jobs was one labeled "daily jbrain backups". Aha!
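For the curious, the "output every possible cronjob" step can be sketched roughly like this. It is not what I actually ran; it is a minimal Python sketch that assumes the usual Linux cron locations (adjust for your distro), and the function names are made up for illustration. Comments are kept in the results on purpose, because a label like "daily jbrain backups" usually lives in a comment.

```python
from pathlib import Path

# Typical cron locations on many Linux systems (an assumption; adjust per distro).
DEFAULT_CRON_ROOTS = ["/etc/crontab", "/etc/cron.d", "/var/spool/cron"]


def collect_cron_lines(roots):
    """Return (file, line_number, text) for every non-blank line under the roots."""
    found = []
    for root in map(Path, roots):
        try:
            if root.is_file():
                files = [root]
            elif root.is_dir():
                files = sorted(p for p in root.rglob("*") if p.is_file())
            else:
                continue  # this location does not exist on this machine
        except OSError:
            continue  # not allowed to list it (user crontabs usually need root)
        for path in files:
            try:
                text = path.read_text(errors="replace")
            except OSError:
                continue  # unreadable without the right permissions
            for number, line in enumerate(text.splitlines(), start=1):
                if line.strip():
                    found.append((str(path), number, line.strip()))
    return found


def search_cron(lines, keyword):
    """Case-insensitive search over the collected lines, comments included."""
    needle = keyword.lower()
    return [entry for entry in lines if needle in entry[2].lower()]
```

Calling `search_cron(collect_cron_lines(DEFAULT_CRON_ROOTS), "backup")` as root would then list every matching line with its file and line number.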

But wait, that backup script is in perl. I didn't know perl, but I had the spirit of all engineers in that we know we can figure anything out. It's actually quite an intuitive language. And all you really need in order to debug is the ability to print to stdout.

I quickly found that this backup perl script relies on a text file with a list of stores to know what to back up. Grepping through that list, we could see it was certainly missing many, many stores. So what populates this file?
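The check itself is just a set difference. Here is a tiny Python sketch of it, assuming the list file holds one store ID per line (I am not describing the real file format) and using a made-up function name:

```python
def missing_stores(expected_ids, list_text):
    """Return the expected store IDs that do not appear in the list file's text."""
    listed = {line.strip() for line in list_text.splitlines() if line.strip()}
    return sorted(set(expected_ids) - listed)
```

Given the IDs of every deployed jbrain and the contents of the list file, this returns exactly the stores the backup script would silently skip.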

At this point I could have done a combination of find/grep, but thankfully I noticed that this text file was last modified at 11PM on the dot the previous day. Lol, crontab again it is. Scanning the crontab output from the previous step, and what do you know, another perl script.
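The trick of going from a file's modification time back to the job that wrote it can be sketched in Python as below. This is a hedged illustration, not the commands I used: the function names are hypothetical, and the crontab matching is deliberately simplistic (it only understands plain numbers in the minute and hour fields, not ranges, lists, steps, or `*`).

```python
import os
from datetime import datetime


def modified_at(path):
    """Return the file's last-modified time as a local datetime."""
    return datetime.fromtimestamp(os.stat(path).st_mtime)


def entries_scheduled_at(crontab_text, hour, minute):
    """Return crontab lines whose minute and hour fields are exactly these numbers."""
    matches = []
    for line in crontab_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        fields = stripped.split()
        if len(fields) < 6:
            continue  # not a "minute hour day month weekday command" line
        if not (fields[0].isdigit() and fields[1].isdigit()):
            continue  # simplification: skip '*', ranges, lists, and steps
        if int(fields[0]) == minute and int(fields[1]) == hour:
            matches.append(stripped)
    return matches
```

A file stamped at exactly 23:00 plus a crontab entry starting with `0 23` is a strong hint, though a long-running job would finish (and stamp the file) later than its scheduled start.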

This time, I noticed something peculiar. The perl script called /usr/bin/mysql with some variables. Chasing down these variables led to some env files. And at this point, I realized that it was calling a database that I didn't know about. This database wasn't in my training, it was never mentioned by the support engineers, it wasn't on the Google Sheet listing the databases maintained by the ex-database administrator, and it was not part of my knowledge transfer with the ex-system admin either.

I called Kobi and told him the situation, and then we simply shared a kind of chuckle reserved for situations of absurdity.

Back to work, I obviously started by logging into this MariaDB lost through time. There were only a few tables, nothing mind-blowing or anything. With the database in view, I started tracing the perl script to see what it was doing with it. And actually, once I figured out how to run the perl script, the problem quickly became apparent: an unhandled error when the script tried to insert rows into the database. For a moment, it was the developer's happy debug loop of modify, run, read until eureka!
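To illustrate the class of bug (not the actual fix), here is a minimal Python sketch of inserting rows so that one bad row is recorded rather than killing the whole run. It uses the standard library's sqlite3 as a stand-in for the MariaDB instance, and the table and column names are invented for the example.

```python
import sqlite3

# Hypothetical schema: "model" plays the role of the ancillary data
# that some rows turned out to be missing.
SCHEMA = "CREATE TABLE stores (store_id TEXT PRIMARY KEY, model TEXT NOT NULL)"


def insert_rows(conn, rows):
    """Insert rows one at a time, collecting failures instead of aborting."""
    inserted, failed = 0, []
    for row in rows:
        try:
            with conn:  # commits on success, rolls back if the insert raises
                conn.execute(
                    "INSERT INTO stores (store_id, model) VALUES (?, ?)", row
                )
            inserted += 1
        except sqlite3.Error as exc:
            failed.append((row, str(exc)))  # keep going, report at the end
    return inserted, failed
```

Whether to skip a bad row or stop loudly is a design choice; the point is that the script should do one of those deliberately and say so in its output.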

Anyway, what the issue was isn't important (it is fixed by now! New jbrains had a recent upgrade, so some ancillary data was missing), but the discovery of the database is. This database, until we can move on from it, is a critical part of the company's infrastructure. The existence of this database, even mostly unmanaged as it is now, changed how development for operations can move forward. We started documenting it. Though opportunities are few, later development did consider whether we could use that database. Once I figured out how to get myself superuser access, I even started adding new tables for my development needs.

Looking back, I think of this story as a fond discovery. The CTO was definitely pleased to hear about this find. And I think it's a lesson in how effective but forgotten scripts and software can quietly run for years until the day something breaks.

PS: We are starting to centralize the various perl and bash scripts across servers and put them under version control. Not forgetting those too!

the data recovery

Many developers will have done this, and some probably do it as a daily routine, but a recent data recovery job of mine felt like the latest expression of my career's progress so far.

The Problem

After being notified by some customers, AppCard discovered that a real-time SQS data queue provided by a third party hadn't provided real-time data in a while. Though we were able to quickly notify the third party to bring that system back online, we still had an issue: approximately 4 days of data were missing and unprocessed.

{% include centerImage.html url="/assets/DataRecovery/not_my_problem.gif" desc="What I wanted us to say to them but they said this to us first" title="The 3rd-party didn't say this, but more like 'we don't want to deal with this'" alt="Jimmy Fallon on The Tonight Show saying 'This sounds more like a you problem'" %}

Based on business considerations, we decided that it would be best if we could recover the data without needing help from the third-party (instead of telling them that they should be doing this because after all it's their fault). When the integration lead hesitated to take on this responsibility due to allocation constraints, I volunteered to take on the challenge. There were two components I had to address before even committing (because free credits for customers are expensive but simple): 1. Is it possible to retrieve the data from the third-party's available API? 2. How long would it take to implement this?

{% include centerImage.html url="/assets/DataRecovery/give_the_money.gif" desc="How I imagine any average customer hearing about missing data" title="The greed of man is insatiable" alt="Scene from the show Friends where Phoebe grabs Ross then threateningly says 'Give me your money, Punk'" %}

The Fix

Firing up a Jupyter notebook, I got to work. Quickly, I was able to confirm that, with the right secrets pulled from the right place and just by reading the documentation, the third party's API seemed able to provide the data we needed. (We are leaving aside the question of why we rely on SQS instead of this API ;) ). Additionally, after quickly skimming through our integration subsystem, I was able to identify a location in the flow where the right data could be injected with the right dummy setup.

{% include centerImage.html url="/assets/DataRecovery/in_theory_possible.gif" desc="I was 70% sure I could do it" title="The line between confidence and arrogance is thin" alt="Some dude on a red couch saying 'In theory it's possible'" %}

Gauging my own speed of development, considering that realistically I only grasped maybe 60-70% of how to use the API or the integration subsystem, and adding some buffer, I estimated 2 days for implementation and 1 day to run the recovery process. I then presented my findings to the business and tech leads that afternoon, and they gave me the green light to go ahead.

{% include centerImage.html url="/assets/DataRecovery/you_got_this.gif" desc="I didn't include a few worrying discussions of possible side-effects" title="Bill Murray would make a great tech lead" alt="Bill Murray in a suit with left eyebrow raised while holding a wine glass on his left hand and pointing at the screen with his right hand at the viewer with caption 'You Got This'" %}

Our async infrastructure and integration is already built on the Python framework Celery, convenient grounds for this one-off development. The simple overview of the job is that it would pull data for 100 transactions at a time, process it, and repeat until it hit a transaction outside the 4-day gap. I made sure to provide sufficient optional parameterization in case I needed to restart the job if it failed or stopped unexpectedly. Since we can only deploy once a day, a struggling but still-running process was better than having to wait for the following day to fix the code and start over. This also meant an almost excessive amount of logging, so as to have intimate visibility into how the recovery task was going, and to provide the necessary parameters if the job needed restarting.
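The shape of that job, stripped of Celery and of the real API, looks roughly like the sketch below. Everything named here is hypothetical: `fetch_page` stands in for the third party's API call, `process` for the injection point in our integration flow, and the cursor-based paging is an assumption. It also assumes pages come back in time order and that the starting cursor falls inside the gap.

```python
import logging

log = logging.getLogger("recovery")


def recover(fetch_page, process, gap_start, gap_end,
            cursor=None, page_size=100, max_pages=None):
    """Process transactions page by page until one falls outside the gap.

    Returns the cursor to resume from, or None when there is nothing left to do.
    """
    pages = 0
    while max_pages is None or pages < max_pages:
        transactions, next_cursor = fetch_page(cursor, page_size)
        log.info("fetched %d transactions at cursor=%r", len(transactions), cursor)
        for tx in transactions:
            if not (gap_start <= tx["timestamp"] <= gap_end):
                log.info("transaction %s is outside the gap, stopping", tx["id"])
                return None
            process(tx)
        if next_cursor is None:
            log.info("no more pages, stopping")
            return None
        cursor = next_cursor
        pages += 1
        # Logged on every page so a stopped run can be restarted by hand.
        log.info("to resume from here, pass cursor=%r", cursor)
    return cursor
```

The `cursor` argument is the "optional parameterization" in miniature: if a run dies, the last logged cursor is all that is needed to re-trigger it from where it left off.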

{% include centerImage.html url="/assets/DataRecovery/laying_train_tracks.gif" desc="Conceptual visual of my architecture" title="I'm Gromit" alt="The beagle Gromit from the series Wallace and Gromit riding a toy train and laying down the train tracks for that toy train as fast as he can so he won't crash" %}

Once I felt comfortable, I made sure to test my task in our pre-production environment. But admittedly our pre-production data is very different from real production. There were immediate hiccups once this was merged into production: one of our assumptions turned out to be incorrect, and sometimes the async job didn’t automatically repeat even though there was more data in the gap to query. Thankfully, because of the logging, I could manually re-trigger the jobs with the right parameters. This meant more human intervention but still allowed the job to finish.

{% include centerImage.html url="/assets/DataRecovery/phew.gif" desc="I didn't do this cause I was sitting next to the business, but I was this internally" title="A lot of internal self-praise" alt="Some guy wiping his brow" %}

The Conclusion

In the end, almost none of our customers even noticed the data gap. Shoppers got their points and we didn't need to give anyone any extra credit. My teammates could focus on other tasks while I proved to myself that I can solve vague and unknown problems by myself. This mini-project was well-delivered, well-scheduled, and had real immediate business impact on the bottom line. Coming home that day, I felt like I had earned my paycheck.

{% include centerImage.html url="/assets/DataRecovery/honest_work.jpg" desc="Professional pride feels good" title="Couldn't find the gif for this" alt="The meme with the farmer and caption 'It ain't much, but it's honest work'" %}

experimental films with Linh p1

I Had Nowhere to Go by Douglas Gordon is an audio experience where I wince as a man continually hacks a beet dangerously close to his fingers and have a staring contest with a chimpanzee. Also, it is entirely carried by the tales of Jonas Mekas's escape from war-torn Europe and his early years in New York City.

binge

About 4-5 weeks ago, it started with Legend of the Northern Blade, well really it started with /r/manga. Now, of course those few chapters wouldn't get me anywhere, so next came Red Storm cause it's martial arts, and ki, and that was oh so so exciting! But the ending was a little weird, and the mentor character was interesting but not explained; thankfully there is Peerless Dad, which is set in the same universe. That was Korean, and the martial arts excitement hadn't ended, so of course we've got to at least check out Gosu S2 a little bit. At this point, I was a little bored with manga/pictures, so we had to switch to webnovels of course. The first one was Omniscient Reader's Viewpoint, but that was dark and depressing by the end. Not like horrifying, more like gut-wrenching in how people would sacrifice themselves for others. I clearly needed something way lighter to cleanse my palate. That's when I stumbled on A Stay-at-home Dad's Restaurant In An Alternate World, which was a light, easy read. But very frustratingly, it didn't have an ending. Can you understand that? Like reaching for the "NEXT" button but nothing's there. It was close to finishing the thirst. I took the plunge and browsed Completed on b o x n o v e l. Finally, I landed on Rebirth to a Military Marriage: Good Morning Chief. I felt complete, I felt really complete then. It was an amazing novel. An amazing amazing story.

PS: Wed Jun 1 2022, lol, let's also add The Legendary Mechanic

PPS: Sun Jun 26 2022, ok, let's really really end this binge-ing with Tales of Herding Gods. Which is so good I would re-read and do a YouTube channel on it if I could.