2022

the database discovery

This is probably my most interesting story so far at this job. No lie, I really did discover a database in production that no one else knew existed.

It starts when Kobi, AppCard's Operations Director, approached me one day and said, "Hey Viet, can you look into why one of our jbrains wasn't backed up?"

For context, jbrains are the on-prem devices AppCard deploys to our customers (the grocery stores). These brains sit in the grocery store's network and communicate with the various point-of-sale devices to administer coupons, the loyalty system, etc. The jbrain is highly configurable, as each grocer has different needs and integrations.

I knew the jbrain's "backups" are really just daily copies of these configuration files, stored on a server on AWS (we will leave aside the question of why not S3). With these files, a jbrain replacement can be "built" with the same configurations if there are hardware failures or the like.

After confirming that the jbrain in question had no backups, and that other jbrains were actually missing their backups too, the only suspects were a bug in the backup process or a bug on the backup server. Now I know this backup server; the tech support guys and I use it every day to do our work. But it's a holy mess of scripts created by half a dozen sysadmins that I never got a knowledge transfer on, so we can't start our search there. How about the process? Do we know how the backup process works? Of course we don't. And just as obviously, the guys that actually built it are long gone and didn't leave behind any documentation on either the process or the server. The only clue I had was someone mentioning: "I think it's scheduled to run daily at 1 or 2AM or something".

Now that could mean anything, but to me, that sounded like a crontab. At the very least, I hope the crontab exists on the backups server, and not some other server, cause oh boy do we have a lot of servers (as an aside, this monstrosity of complexity is being worked on, with no end in sight). I was able to find a way to output every possible cronjob (users, cron directories), and nestled in all those jobs was one labeled "daily jbrain backups". Aha!
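For the curious, the "output every possible cronjob" step can be sketched roughly like this. This is not the command I actually ran, just a minimal Python illustration. The default paths are assumptions (the usual system crontab, the `cron.d` directory, and the per-user spool, whose location varies by distro), and the function names are made up for this post. Comments are kept on purpose, since a label like "daily jbrain backups" often lives in a comment above the job.

```python
from pathlib import Path

# Assumed locations; adjust for your distro. Reading the spool usually needs root.
DEFAULT_PATHS = ["/etc/crontab", "/etc/cron.d", "/var/spool/cron"]


def collect_cron_lines(paths=DEFAULT_PATHS):
    """Return (file, line_number, text) for every non-blank line in the given
    crontab files or directories. Missing or unreadable paths are skipped."""
    found = []
    for root in map(Path, paths):
        if root.is_file():
            files = [root]
        elif root.is_dir():
            files = sorted(p for p in root.rglob("*") if p.is_file())
        else:
            continue  # path does not exist on this machine
        for f in files:
            try:
                text = f.read_text(errors="replace")
            except OSError:
                continue  # e.g. permission denied
            for lineno, line in enumerate(text.splitlines(), start=1):
                if line.strip():
                    found.append((str(f), lineno, line.strip()))
    return found


def search(lines, keyword):
    """Case-insensitive filter over the output of collect_cron_lines."""
    needle = keyword.lower()
    return [hit for hit in lines if needle in hit[2].lower()]


if __name__ == "__main__":
    for path, lineno, text in search(collect_cron_lines(), "backup"):
        print(f"{path}:{lineno}: {text}")
```

Dumping everything first and filtering second is the point: when you don't know which user or directory owns the job, a keyword search over one flat list is faster than guessing.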

But wait, that backup script is in perl. I didn't know perl, but I had the spirit of all engineers: we know we can figure anything out. It's actually quite an intuitive language. And all you really need in order to debug is the ability to print to stdout.

I quickly found that this backup perl script relies on a textfile with a list of stores to know what to back up. Grepping that list, we can see it is certainly missing many, many stores. So what populates this file?

At this point I could do a combination of find/grep, but thankfully I noticed that this textfile was last modified on the dot at 11PM the previous day. Lol, crontab again it is. Scanning the crontab output from the previous section, and what do you know, another perl script.

This time, I noticed something peculiar. The perl script started calling /usr/bin/mysql with some variables. Chasing down these variables led to some env files. And at this point, I realized that it was calling a database that I didn't know about. This database wasn't in my training, it wasn't ever mentioned by the support engineers, it wasn't on the Google Sheet containing the list of databases maintained by the ex-database administrator, and it was not part of my knowledge transfer with the ex-system admin either.

I called Kobi and told him the situation, and then we simply shared a kind of chuckle reserved for situations of absurdity.

Back to work, I obviously started by logging into this MariaDB lost through time. There were only a few tables, nothing mind-blowing or anything. With the database open next to the perl script, I started tracing the script to see what it was doing with the database. And actually, once I figured out how to run the perl script, the problem quickly became apparent: an unhandled error in the script when it tried to insert rows into the database. For a moment, it was the developer's happy debug loop of modify, run, read until eureka!

Anyway, what the issue was isn't important (it is fixed by now! There was missing ancillary data because new jbrains had a recent upgrade), but the discovery of the database is. This database, until we can move on from it, is a critical part of the company's infrastructure. The existence of this database, even mostly unmanaged as it is now, changed how development for operations can move forward. We started documenting it. Though opportunities are few, future development did consider whether we can use that database. Once I figured out how to get myself superuser access, I even started adding new tables for my development needs.

Looking back, I think of this story as a fond discovery. The CTO was definitely pleased to hear about this find. And I think it's a lesson in how effective but forgotten scripts and software can quietly run for years until the day something breaks.

PS: We are starting to centralize the various perl and bash scripts across servers and put them under version control. Not forgetting those too!

the data recovery

Many developers will have done this, some probably do this as a daily routine, but a recent data recovery job of mine felt like the latest expression of my career's progress so far.

The Problem

After being notified by some customers, AppCard discovered that a real-time SQS data queue provided by a third party hadn’t delivered real-time data in a while. Though we were able to quickly notify the third party to bring that system back online, we still had an issue: approximately 4 days of data were missing and unprocessed.

{% include centerImage.html url="/assets/DataRecovery/not_my_problem.gif" desc="What I wanted us to say to them but they said this to us first" title="The 3rd-party didn't say this, but more like 'we don't want to deal with this'" alt="Jimmy Fallon on The Tonight Show saying 'This sounds more like a you problem'" %}

Based on business considerations, we decided that it would be best if we could recover the data without needing help from the third-party (instead of telling them that they should be doing this because after all it's their fault). When the integration lead hesitated to take on this responsibility due to allocation constraints, I volunteered to take on the challenge. There were two components I had to address before even committing (because free credits for customers are expensive but simple): 1. Is it possible to retrieve the data from the third-party's available API? 2. How long would it take to implement this?

{% include centerImage.html url="/assets/DataRecovery/give_the_money.gif" desc="How I imagine any average customer hearing about missing data" title="The greed of man is insatiable" alt="Scene from the show Friends where Phoebe grabs Ross then threateningly says 'Give me your money, Punk'" %}

The Fix

Firing up a Jupyter notebook, I got to work. Quickly, I was able to confirm that, with the right secrets pulled from the right place and just by reading the documentation, the third party's API seemed to be able to provide the data we needed. (We are leaving aside the question of why we rely on an SQS queue instead of this API ;) ). Additionally, after quickly skimming through our integration subsystem, I was able to identify a location in the flow where the right data could be injected with the right dummy setup.

{% include centerImage.html url="/assets/DataRecovery/in_theory_possible.gif" desc="I was 70% sure I could do it" title="The line between confidence and arrogance is thin" alt="Some dude on a red couch saying 'In theory it's possible'" %}

Gauging my own speed of development, considering that realistically I only grasped maybe 60-70% of how to use the API or the integration subsystem, and adding some buffer, I estimated 2 days for implementation and 1 day to run the recovery process. I then presented my findings to the business and tech leads that afternoon, and they gave me the green light to go ahead.

{% include centerImage.html url="/assets/DataRecovery/you_got_this.gif" desc="I didn't include a few worrying discussions of possible side-effects" title="Bill Murray would make a great tech lead" alt="Bill Murray in a suit with left eyebrow raised while holding a wine glass on his left hand and pointing at the screen with his right hand at the viewer with caption 'You Got This'" %}

Our async infrastructure and integration are already built on the Python framework Celery, convenient ground for this one-off development. The simple overview of the job is that it would pull data for 100 transactions at a time, process it, and repeat until it hit a transaction outside the 4-day gap. I made sure to provide sufficient optional parameterization in case I needed to restart the job if it failed or stopped unexpectedly. Since we can only deploy once a day, a struggling process that keeps running is better than having to wait until the following day to fix the code and start over. This also meant an almost excessive amount of logging, so as to have intimate visibility into how the recovery task was going and to provide the necessary parameters if the job needed restarting.
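The shape of that job, stripped of anything company-specific, looks roughly like the sketch below. It is plain Python rather than the real Celery task, and everything in it is an assumption made for illustration: `fetch_page(offset, limit)` stands in for the third party's API and is assumed to return transactions oldest first as dicts with a `timestamp`, and `process` stands in for the injection point in our integration flow. The parts that mattered in practice are the optional `start_offset` parameter and the log line written before every batch.

```python
import logging

logger = logging.getLogger("gap_recovery")

BATCH_SIZE = 100  # the job pulled 100 transactions at a time


def recover_gap(fetch_page, process, gap_end, start_offset=0, batch_size=BATCH_SIZE):
    """Process transactions batch by batch until one falls after gap_end.

    fetch_page(offset, limit) and process(txn) are hypothetical callables.
    Returns (number_processed, next_offset); next_offset is what you would
    pass back in as start_offset if the job has to be restarted.
    """
    offset = start_offset
    processed = 0
    while True:
        # Log the restart parameters before doing any work for this batch.
        logger.info("fetching batch: start_offset=%d batch_size=%d", offset, batch_size)
        page = fetch_page(offset, batch_size)
        if not page:
            logger.info("source exhausted at offset=%d", offset)
            break
        for txn in page:
            if txn["timestamp"] > gap_end:
                logger.info("reached end of gap at offset=%d", offset)
                return processed, offset
            process(txn)
            offset += 1
            processed += 1
    return processed, offset
```

If a run dies partway through, the last "fetching batch" log line gives the `start_offset` to resume from. Anything already processed in the interrupted batch gets processed again, so this sketch assumes `process` is safe to repeat.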

{% include centerImage.html url="/assets/DataRecovery/laying_train_tracks.gif" desc="Conceptual visual of my architecture" title="I'm Gromit" alt="The beagle Gromit from the series Wallace and Gromit riding a toy train and laying down the train tracks for that toy train as fast as he can so he won't crash" %}

Once I felt comfortable, I made sure to test my task in our pre-production environment. But admittedly our pre-production data is very different from real production. There were immediate hiccups once this was merged into production: one of our assumptions turned out to be incorrect, and sometimes the async job didn’t automatically repeat even though there was more data in the gap to query. Thankfully, I could manually re-trigger the jobs with the right parameters because of the logging. This meant more human intervention but still allowed the job to finish.

{% include centerImage.html url="/assets/DataRecovery/phew.gif" desc="I didn't do this cause I was sitting next to the business, but I was this internally" title="A lot of internal self-praise" alt="Some guy wiping his brow" %}

The Conclusion

In the end, almost all our customers didn’t even notice the data gap. Shoppers got their points and we didn't need to give anyone any extra credit. My teammates could focus on other tasks while I proved to myself that I can solve vague and unknown problems by myself. This mini-project was well-delivered, well-scheduled, and had real immediate business impact on the bottom line. Coming home that day, I felt like I earned my paycheck.

{% include centerImage.html url="/assets/DataRecovery/honest_work.jpg" desc="Professional pride feels good" title="Couldn't find the gif for this" alt="The meme with the farmer and caption 'It ain't much, but it's honest work'" %}

experimental films with Linh p1

I had nowhere to go by Douglas Gordon is an audio experience where I wince as a man continually hacks at a beet dangerously close to his fingers and have a staring contest with a chimpanzee. Also, entirely carried by the tales of Jonas Mekas's escape from war-torn Europe and his early years in New York City.

binge

About 4-5 weeks ago, it started with Legend of the Northern Blade, well really it starts with /r/manga. Now, of course those few chapters won't get me anywhere, so next came Red Storm cause it's martial arts, and ki, and that was oh so so exciting! But the ending was a little weird, and the mentor character was interesting but not explained, thankfully there is Peerless Dad which is set in the same universe. That was Korean, and martial arts excitement hasn't ended, of course we've got to at least check out Gosu S2 a little bit. At this point, I am a little bored with manga/pictures, so we have to switch to webnovels of course. The first one was Omniscient Reader's Viewpoint, but that was dark and depressing by the end. Not like horrifying, more like gut-wrenching in how people would sacrifice themselves for others. I clearly needed something way lighter to cleanse my palate. That's when I stumbled on A Stay-at-home Dad's Restaurant In An Alternate World, that was a light, easy read. But very frustratingly, it didn't have an ending. Can you understand that? Like reaching for the "NEXT" button but nothing's there. It was close to finishing the thirst. I took the plunge and browsed Completed on b o x n o v e l. Finally, I landed on Rebirth to a Military Marriage: Good Morning Chief. I felt complete, I felt really complete then. It was an amazing novel. An amazing amazing story.

PS: Wed Jun 1 2022, lol, let's also add The Legendary Mechanic

PPS: Sun Jun 26 2022, ok, let's really really end this binge-ing with Tales of Herding Gods. Which is so good I would re-read and do a YouTube channel on it if I could.

line goes up

Crypto and its problems

"Line Goes Up – The Problem With NFTs" - Folding Ideas

and

[M]arkets are distributed systems.

Even though there are, in fact, very strict regulators and regulations, I can still enter into a contract with you without ever telling anyone. I can buy something from you, in cash, and nobody needs to know. (Tax authorities merely want to know, and anyway, notifying them is asynchronous and lossy.) Prices are set through peer-to-peer negotiation and supply and demand, almost automatically, through what some call an "invisible hand." It's really neat.

As long as we're in the continuous control region.

As long as the regulators are doing their job.

Here's what everyone peddling the new trendy systems is so desperately trying to forget, that makes all of them absurdly expensive and destined to fail, even if the things we want from them are beautiful and desirable and well worth working on. Here is the very bad news:

Regulation is a centralized function.

The job of regulation is to stop distributed systems from going awry.

Because distributed systems always go awry

...

I find myself linking to this article way too much lately, but here it is again: The Tyranny of Structurelessness by Jo Freeman. You should read it. The summary is that in any system, if you don't have an explicit hierarchy, then you have an implicit one.

Despite my ongoing best efforts, I have never seen any exception to this rule.

Even the fanciest pantsed distributed databases, with all the Rafts and Paxoses and red/greens and active/passives and Byzantine generals and dining philosophers and CAP theorems, are subject to this. You can do a bunch of math to absolutely prove beyond a shadow of a doubt that your database is completely distributed and has no single points of failure. There are papers that do this. You can do it too. Go ahead. I'll wait.

Okay, great. Now skip paying your AWS bill for a few months.

Whoops, there's a hierarchy after all!

Tô Minh Sơn's comment on apenwarr:

"Men prefer to will nothingness than to not will"

and also pointed me to chainalysis's ranking of crypto adoption with Vietnam being on top

conflict theorists

People who vie for power are locked in the most obvious zero sum game in existence, and so they're necessarily conflict theorists. People who have no lust for power themselves and try to explain the world in terms of mistake theory are basically forced to speculate that Power does not exist at all except as epiphenomenon of some poor coordination or whatever, or does not matter. It exists and matters a great deal, however, and shapes the way they live and think, and seeks to triumph over them ever harder. They just don't know yet.

/u/Ilforte

TheMotte pre-Ukraine

Lots of IR stuff in this thread:

What I am wondering is whether our eagerness to expand NATO is having more drawbacks than benefits. Russia's weakness (which they are well aware of) is that because much of their land is currently tundra, the majority of their civilization is in the west, uncomfortably close to NATO. The Kremlin doesn't want NATO forces within "rapid striking distance" of Moscow, which I can totally sympathize with, because I wouldn't want Russian or CCP forces situated in Mexico. They've made it absolutely clear that this is a red line for them. I don't think they particularly want to invade the Ukraine, they just don't want the Ukraine to join NATO because they perceive that as a threat, and they're probably going to do whatever they have to in order to stop that threat

US Capitol attacks

Honorable Mentions:

Various duels and fights conducted in the Capitol or by Senators and Congressmen. Special plaudits go to: the duel in which Representative William J. Graves of Kentucky killed Representative Jonathan Cilley of Maine; the incident on February 6th 1858 in which a debate over the Kansas Territory grew into a fistfight that included over 30 Representatives; "The Battle of the Reed Rules," in which newly-elected Speaker Thomas Brackett Reed attempted to count Democrats in the chamber who were present but remaining silent to defy a quorum, after which Democrats attempted to flee before Reed had the doors ordered locked; the infamous Brooks-Sumner affair, when Preston Brooks of South Carolina beat Charles Sumner with a cane on the Senate Floor over a heated debate on slavery (which only ended when several Senators pulled pistols to restore order); and, less-famously, the caning in 1866, when Lovell Rousseau of Kentucky (a Union general during the war) caned Josiah Grinnell of Iowa, after which Rousseau was censured, resigned, and then re-elected handily in the same seat.

Honorable Mention: The Weather Underground

On March 1st, 1971, radical militant group "Weather Underground" successfully planted and detonated a bomb in one of the men's bathrooms. No one was injured, and no one was ever arrested or charged. Weather Underground leaders Bill Ayers and Bernardine Dohrn were later, famously, at the center of a controversy over how close they were to then-candidate Barack Obama.

Later, in 1983, the "May 19th Communist Organization," a feminist spin-off of the Weather Underground, would plant bombs twice in Capitol restrooms, failing to detonate one on November 6th but succeeding to detonate one on November 7th. Nobody was hurt, 7 people were charged, 2 were sentenced, and one would eventually have her sentence commuted by President Clinton.

...

At first, it looked as if neighboring Virginia would remain in the Union. When it unexpectedly voted for secession, there was a serious danger that the divided state of Maryland would do the same, which would totally surround the capital with enemy states. President Abraham Lincoln’s act in jailing Maryland's pro-slavery leaders without trial saved the capital from that fate.

Faced with an open rebellion that had turned hostile, Lincoln began organizing a military force to protect Washington. The Confederates desired to occupy Washington and massed to take it. On April 10 forces began to trickle into the city. On April 19, the Baltimore riot threatened the arrival of further reinforcements. Andrew Carnegie led the building of a railroad that circumvented Baltimore, allowing soldiers to arrive on April 25, thereby saving the capital.

Wikipedia rather understates the danger. After the incident at Fort Sumter, when the seceded state of South Carolina bombarded the federal garrison there, Virginia voted to secede from the Union, and DC found itself at risk of being totally isolated and captured without any defenses. Lincoln passed a very sleepless week wondering if the capitol was about to be occupied any moment, and was only relieved when the first troops of his 75,000-man militia arrived from Massachusetts.

The 1954 United States Capitol shooting was an attack on March 1, 1954, by four Puerto Rican nationalists who sought to promote the cause of Puerto Rico's independence from US rule. They fired 30 rounds from semi-automatic pistols onto the legislative floor from the Ladies' Gallery (a balcony for visitors) of the House of Representatives chamber within the United States Capitol.

The nationalists, identified as Lolita Lebrón, Rafael Cancel Miranda, Andres Figueroa Cordero, and Irvin Flores Rodríguez, unfurled a Puerto Rican flag and began shooting at Representatives in the 83rd Congress, who were debating an immigration bill. Five Representatives were wounded, one seriously, but all recovered. The assailants were arrested, tried and convicted in federal court, and given long sentences, amounting to life imprisonment. In 1978 and 1979, their sentences were commuted by President Jimmy Carter. All four returned to Puerto Rico.

Five congressmen were injured in the attack but none too seriously

Some commentary:

I think part of this is the dichotomy of politicians as symbols and as people. Politicians have power, they are privileged. But they (in theory) have that power because they have been invested with decision making powers by the people.

Like 9/11 wasn't targeting as many killings as possible, they targeted symbols of America, symbols of capitalism and power. This is because arguably barring nukes or similar, no matter how many people you kill in the US, it won't really affect anything. You could kill a hundred thousand people and not much actually happens. The nation will go on.

Bring down the World Trade Centre? Destroy the Pentagon? The White House or Capitol? That has an outsized impact on the nation, because they have an outsized meaning to the nation. Politicians are invested with having a meaning beyond their own life.

So politicians are at once representations of privilege and symbolic concentrations of the common man. So when you attack a politician are you punching up at their power? Or down at the thousands of standard Americans they are the symbolic representation of? Sideways if you are one of said standard Americans?

Other commentary:

B) the intent or the "what if" versus what happened. B) is kind of a complicated one, because I'm expecting responses of "THEY RANSACKED THE CAPITOL!," which is... kind of true, but also they could have done so much worse. I find it hard to get past that: they had every opportunity to do real damage, and yet for the most part, they acted like drunken frat boys. Like all of Trump's presidency, for all the bluster and barking, there was (virtually) no bite. I can even quote Chuck Schumer on that: "all this mob did was delay our vote a few hours." Like Heath Ledger's Joker, they were the dog that caught the car and didn't know what to do with it. I just- that tension bothers me, that so many attitudes seem based in what they could have done instead.

Do we judge people for what they could have done, or for what they did? Judging a mob for failed intent is... dangerous ground to stand on, in my opinion. "Hang Pence" is a clear threat, but a serious one? A whole lot of people make clear yet (supposedly) unserious threats; shall we round them up too?

More radical takes:

Neither the government nor its public servants are sacred.

I've been hearing a lot of public officials describing capitol hill as "sacred" and the democratic process as "sanctified". Even Joe Biden (a "catholic" no less) described it as such during his speech yesterday.

In no uncertain terms do I reject this framing. Government is transient by definition. Rights are endowed to the individual, not by the state, so there is no real significance in the means by which we govern other than "we like it." If we fail to like it any longer, then we the people have every right to restructure our government in whatever way we please. There is nothing holy or sacred about it.

Private citizens, however, are "sanctified" if you will by "the will of god," endowed "with inalienable rights" by which they "shall not be infringed." Several of the founders questioned whether it was wise to even have a bill of rights, so as to make it appear that we don't have certain, unlisted rights (I think it was the right call, tbh).

So no, I have a pretty particular opinion on the January 6th riots: they were dumb, potentially malicious (but mostly dumb) people who were frustrated by the means of governance. They exercised this frustration by going to the seat of power and expressing it to their leaders with force. This is infinitely more palatable to me since I believe sanctity lies with the private citizen, not the public servant. With this modality, I see the capitol riot as rather benign compared to the 2020 summer riots, pillaging, looting, burning, and destroying my own community because they are acting on private citizens, not public servants.

30 people died during 2020 due to the riots. The same cannot be said of the capitol riots. If the whole government of the United States were overturned that day, I would be more concerned with how they planned on drafting a new constitution than I would be with the public servants caught in a dangerous situation. Public leaders of the country are a lower priority to me than the retention of my rights as a private citizen, and this should be, frankly, how everyone sees it.

It sounds pretty radical now that I type it out, but I stand by it.

More commentary:

If the described plan existed, and was only foiled by Pence not playing along, he would have found out, kept receipts and gone on Oprah as a hero, rather than retiring to a life of obscurity. He may not be a shining intellectual star, but you don't get to be vice president by being blind enough to get played like that.

Also, I'll believe that some kind of plan existed in Trump's delusional inner circle, but a plan that required the direct complicity of the Secret Service and the Capitol Police? With no leaks a year later? Hatched by a president who couldn't conspire to hold on to even his own chief of staff? Helped by Four Seasons-guy?

The biggest argument against most conspiracy theories is that they require a level of competence on part of the perpetrators that very clearly does not exist (and, if it did exist, the conspiracy often wouldn't be necessary -- I mean, 9/11, all that to get to go to war in Iraq, instead of just fabricating and planting convincing evidence of a WMD-program?).

/u/mseebach

america - europe - national identity and divorces

There is no “America”, and anyone who tells you there is hasn’t travelled enough. “American” includes everyone from Eskimos who speak in their native tongue and eat whales and have never seen a two story building to NYC Wall Street billionaires. It is everyone from people living in some of the most rural areas on earth to people living in some of the most densely populated areas on earth. We do not have a common religion and increasingly do not even have a common language (or even expectations of if we should have those things!). Our internal legal structure is de facto vague enough that it led to a massive civil war…that didn’t answer the fundamental problems that led to it. Where does a state’s right end and the federal government’s rights begin? As of 2022 we do not know.

At this point, there’s like 5 Americas at minimum. They very seldom have anything to do with one another, even when acting in good faith. Should food be expensive? In a place like South Dakota, where “food” is made, that question reads like, “should farmers deserve good pay for their hard work”, in places like LA/NYC/DC, that question reads like, “why is everyone trying to bleed me dry financially”. And they are both correct.

Or guns: I’ve been to remote villages in Alaska where you are required to have a gun in your car during Polar Bear Migration Season because hungry polar bears are dangerous and help is far away. I’ve also been to Manhattan, NY where the idea of a very powerful rifle that can shoot 500 meters is (correctly!) seen as absurdly dangerous to the point of bad-faith. And yet firearms are handled nationally. Neither NYC nor Nome, AK is wrong in their understanding, and the system is set up to enrage us.

You have lots of people who favor a “Czech/Slovak” divorce, if in sotto voce. America is no longer a meaningful unit and the sooner we quit lying to ourselves about it the happier we will be.

Another comment:

Doesn't every country have that same "big city asshole" versus "country bumpkin" dynamic? I'm sure that an educated, white-collar worker in Stockholm, Tokyo, or Shanghai and a rural farmer in the same country hold very different values and priorities, yet they aren't waging a culture war with the same ferociousness we are in America. (Or are they? I'm curious now.)

A reply:

A lot of it is because they can’t. Old school, Melian Dialogue style. “The weak do what they must, the strong do what they will”.

My father is from Belgium, and I have lots of ties to there. I’m not Belgian, but I “get” them. Belgium is really 2 countries uncomfortably glued together. But they realize that even as one country, they’re small and vulnerable. As two, they just wouldn’t matter. At least by grudgingly accepting one another, they can have some safety in the world.

America, on the other hand, is in a similar position. Texas and California do not have alignment of interests. The difference, though, is Texas by itself is the world’s 9th largest economy and California the world’s 5th largest. A Belgian looks across the culture divide differently than a Californian. After all, what would happen if California left? It would have a higher GDP than the United Kingdom, and that’s not exactly a terrible fate.

And a thought about Europe:

The Northern European societies are in a strange suspended anodyne state because they can simply import their culture from America, have no external threats thanks to America and can simply use their existing advantages to stay on top of American controlled world economy, meanwhile giving over their political decision making to the EU. These countries don't have much of an identity crisis because they have long decided a slow comfortable death is the way to go. Their people will only object if the death symptoms become suddenly a bit too difficult to ignore (ie refugee crisis). If you have no ambitions left in life you won't make many enemies

thich nhat hanh died

To live, we must die every instant. We must perish again and again in the storms that make life possible.

not sure if this is exceptional writing, but it got a lot of quotes (so like mine), and temporally relevant.

The feeling began shortly before eleven o’clock at night on October first. I was browsing on the eleventh floor of Butler Library. I knew the library was about to close, and I saw a book that concerned the area of my research. I slid it off the shelf and held it in my two hands. It was large and heavy. I read that it had been published in 1892, and it was donated to the Columbia Library the same year. On the back cover was a slip of paper that recorded the names of borrowers and the dates they took it out of the library. The first time it had been borrowed was in 1915, the second time was in 1932. I would be the third. Can you imagine? I was only the third borrower, on October 1, 1962. For seventy years, only two other people had stood in the same spot I now stood, pulled the book from the shelf, and decided to check it out. I was overcome with the wish to meet those two people. I don’t know why, but I wanted to hug them. But they had vanished, and I, too, will soon disappear. Two points on the same straight line will never meet. I was able to encounter two people in space, but not in time.