Jake Gold
@jacob.gold
did:plc:tpg43qhh4lw4ksiffs4nbda3
10542 followers | 2132 following | 1305 posts
Engineer @ Bluesky Mountain View, CA I like people and other animals, technology, programming, history, gaming, and a lot of other stuff. I probably like you. Views expressed here are my own Email: jake@blueskyweb.xyz / j@jacob.gold
This code seems untouched in 9 years, and it's part of the Seastar framework which is an awesome system.
I'm just disabling hyper threading (SMT), which I expected to do for a better CPU:RAM ratio anyway, so no big deal!
Vodafone ES may be blocking our network provider (OVH). I'm not sure if you can unsubscribe from whatever website blocking features they have, but that may be the problem.
That is odd. Are you on a network that is using a firewall like "Pi Hole" or other "Content Filtering" devices? Some corporate networks have these.
If you're using a phone on wifi, can you try using cellular internet (turn off wifi)?
giving me nightmares about the 1.5 TB nodes I still want to memtest...
So much for MAC address based file names!
(one reason syslinux/pxelinux probably uses hyphens for MAC-based configs?)
Very cool!
* replication factor
These are our most expensive configs at xx,xxx/ea.
These configs can't be rented, but even if they could be, would be 10x or more the cost per year.
Challenge is creating capacity to scale to large numbers on a (relatively) small budget. If you run the numbers, buying is the only viable option.
Yeah, I like reading about this stuff too. Will try to post more details as well, mostly after we've finished this chunk of work since it's still changing and we're just focused on getting it done.
The way we're doing the AppView v2 and Relay v2, they're essentially a combined service. This is one of the ScyllaDB cluster nodes they're using.
And you tell them to get some more IOPS and call you in the morning?
lol good spotting. Filtered from the input by a grep since they're the two 3.8 TB OS NVMe drives.
Something like that because the storage gets chopped in half and then chopped in thirds and then chopped in half again.
Due to RAID, replicator factor, and compaction temporary space.
And the data gets multiplied due to the denormalization required to enable high performance querying.
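A rough worked version of that arithmetic, as a minimal sketch. It assumes the first halving is RAID mirroring, the "thirds" is a replication factor of 3, and the second halving is free space reserved for compaction; the function names and the 120 TB input are illustrative, not figures from these posts.

```python
from fractions import Fraction


def usable_fraction(raid=Fraction(1, 2), replication_factor=3,
                    compaction_headroom=Fraction(1, 2)):
    """Fraction of raw disk capacity left for unique data.

    Assumed mapping: half lost to RAID, a third kept after replication,
    half of the remainder kept free as temporary compaction space.
    """
    return raid * Fraction(1, replication_factor) * compaction_headroom


def usable_tb(raw_tb):
    """Unique data (in TB) that fits in raw_tb of raw capacity."""
    return float(raw_tb * usable_fraction())


if __name__ == "__main__":
    print(usable_fraction())  # 1/12
    print(usable_tb(120))     # 10.0
```

Under those assumptions only one twelfth of the raw capacity holds unique data, and that is before the denormalization multiplier on the data itself.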
And this is what you want to see on your new Linux database nodes' `lsblk` output.
It's a Mellanox ConnectX-6 Dx (dual 100G), and we haven't done serious optimizations yet, nothing like offloading TLS.
It's for the new v2 system, not the current user levels.
Yeah, even open source projects like "Linux" are trademarked for good reasons. It's the only way to prevent bad actors from tricking users, etc.
The Linux Foundation has a Trademark Policy doc, and that's probably the solution here as well.
CC @emily.bsky.team @rose.bsky.team
Fixing the big bugs by creating the little bugs. It makes sense if you don't think about it.
I'm way too full, which is how I judge the success of Thanksgiving.
Feeling very grateful to be working on a project that has the opportunity to "push the human race forward" by fixing the bugs in social media.
And to be doing it with a great team, that makes success likely and the work enjoyable.
I've found a couple of Yubikeys are well worth the peace of mind. I've washed my primary one at least five times now without any problem.
Sorry about that, we'll look into it. Thanks for reporting the issue.
Very likely that any issue like that is a temporary problem of some kind, but it obviously shouldn't happen and we'll try to prevent it.
Looks like it's configured to network boot, which is a good way to do things in these kinds of systems. My guess would be that the network boot server is down or there's a networking issue (someone unplugged it lol)
Settings -> "Sign out"
Fortunately, it's a very simple HTTP load balancer for just a couple of in-house services. So observability is much easier than a generic router with a complicated mix of customers/services behind it, like at an ISP.
It's not for me personally
This is for a server that is designed to be able to push petabytes per month of atproto (and Bluesky) data to/from PDS hosts, Feed Generators, etc.
This is a load balancer. The NIC is of course doing a lot of the hard work and that's all ASICs. But then just a lot of CPU threads (~1 per Gbit/s) on a modern PCI bus.
Pretty amazing how powerful the most recent generations of hardware are!
You'll be streaming from it soon enough
(especially when you want to deliver complete copies of the entire atproto network to anyone who wants it, for free)
that is a good evening
Welcome to Bluesky. Big fan!
Please! I'll ask you for a bit of time after Thanksgiving week, and then whenever works for you.
Welcome back Brad! Good timing. Things are heating up and open network is coming soon!
Lots of super fun things to work on!
What about a "raw" mode where a cardyb endpoint just extracts all the Open Graph key:values? That'd solve your request, right?
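A minimal sketch of what such a "raw" extraction could look like, using only Python's standard library. This is not the cardyb implementation; `MetaTagExtractor` and `extract_card_metadata` are hypothetical names, and including `twitter:` keys alongside `og:` keys is an assumption.

```python
from html.parser import HTMLParser


class MetaTagExtractor(HTMLParser):
    """Collects og:* and twitter:* key/value pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Open Graph uses property="og:...", Twitter cards use name="twitter:...".
        key = attr.get("property") or attr.get("name")
        content = attr.get("content")
        if key and content is not None and key.startswith(("og:", "twitter:")):
            # Keep the first value seen for each key.
            self.tags.setdefault(key, content)


def extract_card_metadata(html):
    """Return every og:/twitter: meta key and its content as a dict."""
    parser = MetaTagExtractor()
    parser.feed(html)
    return parser.tags


if __name__ == "__main__":
    page = ('<html><head><meta property="og:title" content="Example">'
            '<meta name="twitter:card" content="player"></head></html>')
    print(extract_card_metadata(page))
```

Returning the pairs untouched leaves it to the client to decide what to do with a value like `player`.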
Probably an app change as well, but what's the idea?
YouTube's value is:
<meta name="twitter:card" content="player">
What would we do w/that?
Yup! This is the easy scaling issue: unauthenticated users will get a cached response that isn't dynamically personalized for them. But the TTL will be low so it's still pretty real-timey.
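The caching idea can be sketched roughly like this: one shared, non-personalized response per key, recomputed once it is older than a short TTL. `TTLCache` and its parameters are illustrative assumptions, not the actual service code.

```python
import time


class TTLCache:
    """Caches one shared response per key and expires it after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so the expiry logic can be tested
        self._entries = {}      # key -> (stored_at, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]     # still fresh: serve the cached copy
        value = compute()       # missing or expired: rebuild and store
        self._entries[key] = (now, value)
        return value


if __name__ == "__main__":
    cache = TTLCache(ttl_seconds=5)
    print(cache.get_or_compute("/profile/example", lambda: "rendered page"))
```

With a TTL of a few seconds, every unauthenticated visitor within that window shares a single backend render, which is what keeps it cheap while still feeling close to real time.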
Personally, I like to just say that Bluesky is designed to be an open network as in "the web is an open network"
in other words:
This issue was caused by our upstream provider for this host having a DNS outage. No status page update either, of course, but it's not hard to work around, which we have done.
Yes, we're fixing it now (and status page has just been updated).
We're determined to have a more-accurate (automated) status page eventually.
Every status page is a lie, it's just a matter of degrees.
Impressive as hell!
Hot on the heels of last week's Bridgy Fed progress, more to report this week: first pass at Bluesky support is feature complete! All basic interactions - user discovery, following, posts, replies, likes, reposts - are working, both directions.
^ I guess I also need autosuggest on my hashtags so I can avoid typos
#IfItsNotBoiledItsNotABagle is the main reason I need hashtags.
Just in case you wanted an extra one!
(really just because we didn't filter!)
Thank you, John!
Alexander (2004) which is a very flawed movie but has some excellent aspects to it, like the really well done battle scenes and some of the sets.
Can't come soon enough!
FYI: this is the closest to "documentation" so far around this change. More will come soon, I'm sure.
Sorry for the confusion. Auth requests work on bsky.social or directly.
Because auth requests sent to the *.host.bsky.network PDS hosts will get proxied to bsky.social
(This needs to be documented properly, but none of this is required. Sending all requests to bsky.social is totally fine for now.)
Actually no this is just normal PDS stuff with some new auth stuff designed to make it easy to run many PDS hosts near users (in the near future) but make it feel like a single PDS.
But it does improve performance by reducing latency and hops. And eventually something like this may be recommended as part of the protocol (I'm not sure).
Thanks! You didn't miss it, this aspect of talking to bsky.social for auth and talking to the PDS directly for other requests is a new thing and it's not required for anyone to implement it.
It's totally acceptable to send all requests to bsky.social and let it proxy requests to the user's PDS.
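From a client's point of view, the routing described here might look like the sketch below. The set of methods treated as auth-related and the example PDS hostname are assumptions made for illustration; sending everything to bsky.social remains valid.

```python
ENTRYWAY_HOST = "bsky.social"

# Assumption: which XRPC methods count as "auth requests" is illustrative only.
ASSUMED_AUTH_METHODS = frozenset({
    "com.atproto.server.createSession",
    "com.atproto.server.refreshSession",
})


def pick_host(method, pds_host=None):
    """Choose the host to send an XRPC request to.

    Auth requests go to bsky.social. Other requests go straight to the
    user's PDS when it is known, otherwise bsky.social proxies them.
    """
    if pds_host is None or method in ASSUMED_AUTH_METHODS:
        return ENTRYWAY_HOST
    return pds_host


if __name__ == "__main__":
    pds = "example.host.bsky.network"  # hypothetical PDS hostname
    print(pick_host("com.atproto.server.createSession", pds))
    print(pick_host("app.bsky.feed.getTimeline", pds))
```

Going direct only saves a hop; a client that ignores `pds_host` entirely still works.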
Yes, that's auth related. There needs to be some info published on this, which may not have happened yet!
Hmm, no, you should be able to send non-auth related requests directly to the PDS.
Maybe @divy.zone can help!
Most of the backend and frontend is TypeScript. Some of the infra and backend is powered by Go and more of it will be soon.
Nearly everything is open source:
github.com/bluesky-soci...
github.com/bluesky-soci...
github.com/bluesky-soci...
Yes and yes!
Search works across all PDS hosts, it doesn't matter which server you're on. All the data gets sent to the Search service by the Relay service, which gets it from every server.
The official app making you post "Paul is the GOAT" is the single most important reason why third-party clients are important.
Even in the case of an authoritative lookup, it's going to be once every 24 hours or something, well beyond the TTL because there is a different "TTL" for handle verification unrelated to DNS.
Oh yes, we'll get that at some point I'm sure, but it's not there yet. Someone could create a tool or app for Bluesky that allows this, it's totally possible, but it's not in the official app yet.
3/ Or do you mean something else?
2/ Like this?
1/ You can just reply to your own post?
Yup, that's right!
We may do authoritative lookups in some cases, the debug page does. But it's very few requests (like one) so their system is pretty aggressive if that triggers a response.
We'll be using tiny little SQLite databases (one for each user) and (relatively) giant ScyllaDB clusters (for aggregation). No joke
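As a rough illustration of the "one SQLite database per user" idea, here is a minimal sketch with Python's built-in `sqlite3`. The file naming, the `record` table, and `open_user_db` are all hypothetical and say nothing about the real PDS schema.

```python
import sqlite3
from pathlib import Path


def open_user_db(base_dir, did):
    """Open (or create) the SQLite file that belongs to a single user.

    Assumption: one file per DID, with ':' replaced so the name is
    filesystem-friendly, and a single illustrative 'record' table.
    """
    filename = did.replace(":", "_") + ".sqlite"
    conn = sqlite3.connect(Path(base_dir) / filename)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS record ("
        "uri TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        db = open_user_db(tmp, "did:plc:example")
        db.execute("INSERT INTO record VALUES (?, ?)",
                   ("at://example/post/1", "hello"))
        db.commit()
        print(db.execute("SELECT COUNT(*) FROM record").fetchone()[0])
        db.close()
```

Each user's data lives in its own small file, so one user's writes never touch another user's database.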
Yeah, I saw that, but it's actually pretty much impossible to do what we're doing with Postgres efficiently (and cost effectively).
Yeah, ScyllaDB is basically a C++ rewrite of Cassandra with many operational improvements. I respect Cassandra but wouldn't run it myself.
lol!
Your post is actually stored (canonically) in your own individual SQLite database as of this week!
We're in the process of moving our v1 Postgres-based backend to our v2 ScyllaDB-based backend.
Yeah, the afraid dot org service seems pretty unreliable, either because some of the NS are down/don't respond, or because of anti-DDoS measures.
Pretty sure that's the entire problem.
lol (wasn't me!)
We'll look into it. Thanks for the report.
That is unusual. All systems are operating well. Ensure you're not being blocked by a VPN or ad blocker or similar.
But we'll look into it. Thanks for the report.
Might have just been a temporary issue?
Try refreshing or signing out and signing in? If it did, it would be a major software bug, and we haven't seen that happen.
And if it did happen, we could almost certainly fix it.
That's what the sandbox and staging environments are for!
What's the handle, I can check it from other places? Maybe it was a temporary problem with a nameserver?
Look, I know it's not perfect by any stretch but the fact is that even after seven months, I'm eager to pull up Bluesky to see what's going on around here, and how many social media sites can you say that about these days. It makes me happy, not angry or sad.
One of the things about a decentralized network is a lot of the numbers aren't exact. It's probably not counting deleted users correctly, which is fine, but a bit imprecise.
Well 2 months ago, but yeah, wild!
Jaz is being more accurate (taking into account banned/taken down accounts, etc.)
I did try that and didn't work for me.
That is a bit odd. They may have implemented more aggressive blocking to prevent content scraping. I doubt they intend to block social cards as they exist here.
Most servers aren't this aggressive and work fine.
Unfortunately, Etsy is blocking requests for the necessary metadata
Twitter historically did not have an edit function. But it will be possible on Bluesky, just hasn't been implemented yet.
You signed in directly by entering the PDS? Which client?
...and my picnic lunch for the day has been decided!
Yeah, point taken, but there would probably still be cases where the entire domain hasn't been trusted.
It's common to have CDNs and other things like this on alternative domains for various reasons.
And fortunately, this is the last domain to worry about for the foreseeable future.