Hacker News: sa46's comments

The arena experiment is on indefinite hold:

> Note, 2023-01-17. This proposal is on hold indefinitely due to serious API concerns.

https://github.com/golang/go/issues/51317

Potential successor: https://github.com/golang/go/discussions/70257


> Note that the best-case scenario is the elimination of the overheads above to 0, which is at most ~10% in these particular benchmarks. Thus, it's helpful to consider the proportion of GC overhead eliminated relative to that 10% (so, 7% reduction means 70% GC overhead reduction).

Wow. Amazing to see that off-heap allocation can be that good.

https://go.googlesource.com/proposal/+/refs/heads/master/des...


Meanwhile Java and .NET have had off-heap and arenas for a while now.

Which goes to show how much better Go could be had it been designed with the lessons of other languages taken into account.

The adoption of runtime.KeepAlive() [0], and the related runtime.AddCleanup() as replacement for finalizers are also learnings from other languages [1].

[0] - https://learn.microsoft.com/en-us/dotnet/api/system.gc.keepa...

[1] - https://openjdk.org/jeps/421


What a coincidence! :)

I recently used MemorySegment in Java, and it is extremely good. Just yesterday I implemented the Map and List interfaces using MemorySegment as the backing store for batch operations instead of using the OpenHFT stuff.

I tried -XX:TLABSize before but wasn't getting the performance I hoped for.

Not sure about .NET though, I haven't used it since last decade.


> Page Object Models trade off clarity for encapsulation [and] obscure what's actually happening.

This argument also applies to using a function for abstraction.

I've just written a few dozen e2e tests with Playwright. The code looks like:

    await invoiceDetailPage.selectCustomer(page, 'Acme');
    await invoiceDetailPage.selectCustomerPoNumber(page, '1234567890');
    await invoiceDetailPage.setCustomerReleaseNumber(page, '987654321');
    // ...10-15 more lines

Each of those lines is 3 to 20 lines of Playwright code. Aggressive DRY is bad, but Page Object Models are usually worth it to reduce duplication and limit churn from UI changes.


Isn't gettimeofday implemented with vDSO to avoid kernel context switching (and therefore, most of the overhead)?

My understanding is that using tsc directly is tricky. The rate might not be constant, and the rate differs across cores. [1]

[1]: https://www.pingcap.com/blog/how-we-trace-a-kv-database-with...


I think most current systems have an invariant TSC. I skimmed your article and was surprised (but not totally shocked) to see an offset, but the rate looked the same.

You could cpu pin the thread that's reading the tsc, except you can't pin threads in OpenBSD :p


But just to be clear (for others), you don't need to do that because using RDTSC/RDTSCP is exactly how gettimeofday and clock_gettime work these days, even on OpenBSD. Where using the TSC is practical and reliable, the optimization is already there.

OpenBSD actually only implemented this optimization relatively recently. Though most TSCs will be invariant, they still need to be trained across cores, and there are other minutiae (sleeping states?) that made it a PITA to implement in a reliable way, and OpenBSD doesn't have as much manpower as Linux. Some of those non-obvious issues would be relevant to someone trying to do this manually, unless they could rely on their specific hardware behavior.


Out of interest, does training across cores result in any residual offset? If so, is the offset nondeterministic?


I was curious myself, poked around, and found some references. But I'm still woefully incapable of answering that with any confidence and don't want to risk saying anything misleading, so here's the code and some other breadcrumbs:

1. Apparently OpenBSD gave up on trying to fix desync'd TSCs. See https://github.com/openbsd/src/commit/78156938567f79506a923c...

2. Relevant OpenBSD kernel code: https://github.com/openbsd/src/blob/master/sys/arch/amd64/am...

3. Relevant Linux kernel code: https://github.com/torvalds/linux/blob/master/arch/x86/kerne..., https://github.com/torvalds/linux/blob/master/arch/x86/kerne...

4. Linux kernel doc (out-of-date?): https://www.kernel.org/doc/Documentation/virtual/kvm/timekee...

5. Detailed SUSE blog post with many links: https://www.suse.com/c/cpu-isolation-nohz_full-troubleshooti...

6. Linux patch (uncommitted?) to attempt to directly sync TSCs: https://lkml.rescloud.iu.edu/2208.1/00313.html


Wizardly workarounds for broken APIs persist long after those APIs are fixed. People still avoid things like flock(2) because at one time NFS didn't handle file locking well. CLOCK_MONOTONIC_RAW is fine these days with the vDSO.


Sadly GPFS still doesn’t support flock(2), so I still avoid it.


Doesn't it? https://sambaxp.org/archive-data-samba/sxp09/SambaXP2009-DAT...

It would be weird, even for AIX, to support POSIX byte range locks and not the much simpler flock.


It doesn't, at least on the version I have access to, as it is configured on that cluster.

I’m using Linux rather than AIX.

fcntl(2) locks are supported (as long as they aren't OFD), but flock(2) locks don't work across nodes.


It was a while ago (2009-10ish) but I ran into an exceptionally interesting performance issue that was partly identified with RDTSC. For a course project in grad school I was measuring the effects of the Python GIL when running multi-threaded Python code on multi-core processors. I expected the overhead/lock contention to get worse as I added threads/cores but the performance fell off a cliff in a way that I hadn't expected. Great outcome for a course project, it made the presentation way more interesting.

The issue ended up being that my multi-threaded code when running on a single core pinned that core at 100% CPU usage, as expected, but when running it across 4 cores it was running 4 cores at 25% usage each. This resulted in the clock governor turning down the frequency on the cores from ~2GHz to 900MHz and causing the execution speed to drop even worse than just the expected lock contention. It was a fun mystery to dig into for a while.


If you have something newer than a Pentium 4, the rate will be constant.

I'm not sure of the details for when cores end up with different numbers.


TSC is about cycles consumed by a core. Not about actual time. And so for microbenchmarking, it actually makes sense, because you are often much more interested in CPU benchmarks than network benchmarks in microbenchmarking.


You have to benchmark the TSC against a fixed CPU speed, say 1000 MHz; then you have a reliable comparison.


> US veterans have to seek permission

Retired military personnel, not all veterans.


This is a very off topic tangent, but that makes me quite curious.

What is the difference between "veterans" and "retired military personnel"?


“Retired military personnel” have completed their 20+ years of service and retired with full pension. “Veteran” refers to anyone who has served in the armed forces.


So if you only did 10 years and have military skills this regulation does not apply?


Ahhh, right. I guess that's a distinction that makes sense. Thanks.


I'm not sure that it is a distinction with a difference in this specific case, because to my reading, the only folks who might not be covered publicly were those who were not officially, formally, regularly, or directly employed by military agencies, while doing the work alongside those who were so employed. Contractors, for example, may not be bound by the clause if they were not previously a reservist, a civilian DoD employee, an enlisted soldier, or an officer in the armed forces. I am narrowly reading this to steelman their position, and it seems there might be some narrow wiggle room there, but I'm not sure if that's what they meant or if they're quibbling simply to have something to say. They might be technically right though, you be the judge:

https://dodsoco.ogc.osd.mil/Portals/102/emoluments_clause_ap... | https://web.archive.org/web/20250422185437/https://dodsoco.o...

> WHITE PAPER

> APPLICATION OF THE EMOLUMENTS CLAUSE TO DoD CIVILIAN EMPLOYEES AND MILITARY PERSONNEL

[The following paragraph is from the conclusion, and I think this might be Justice Department interpretations, as I don't think these issues have been tested before the Supreme Court. I am not a lawyer, nor do I speak for the military or Justice Department.]

> The Emoluments Clause to the Constitution applies to all Federal personnel. The Clause prohibits receipt of foreign gifts unless Congress consents such as in the Foreign Gifts and Decorations Act, 5 U.S.C. § 7342. For retired military personnel, the Emoluments Clause continues to apply to them because they are subject to recall. The Justice Department opinions referred to in this paper construe the Emoluments Clause broadly. Specifically, the Justice Department construes the Clause to include not only gifts of travel and food, but also payments such as proportionate profit-sharing. To avoid an Emoluments Clause problem resulting in suspension of retired pay, retired military personnel should seek advance consent through their respective Service consistent with 37 U.S.C. § 908. It is prudent for retired military personnel to obtain advance approval even when there is uncertainty about the Clause’s applicability.

Perhaps there's some nuanced reading of "veterans" that includes folks who aren't armed services, although I think they would likely still fall under the purview of this clause, though I am curious about the factors at play here.

Edit: I think that if you are retired and fail to comply to the Gov's liking, all foreign payments are able to be counted against any military pension you may receive. I am less certain about how non-officers who have no pension are treated, or if they are still beholden to the clause after leaving the armed forces.

Here is additional material from the Commissioned Corps Personnel Manual:

https://dcp.psc.gov/ccmis/ccis/documents/CCPM26_9_1.pdf | https://web.archive.org/web/20250529163709/https://dcp.psc.g...

Found this slideshow that has this test:

https://www.oge.gov/web/OGE.nsf/0/A7C0E4D79F3F6D07852585B600... | https://web.archive.org/web/20250505113229/https://www.oge.g...

> 4-Part test to Determine if the Emoluments Clause Does Not Apply:

> 1. U.S. cannot be a member of a foreign state

> 2. Organization must carry out U.S. foreign policy

> 3. U.S. participates in governance of organization

> 4. Congress approved participation, no concern about divided loyalty


So you think that if you are a pilot that left after 19 years you can go and train Chinese pilots without permission?

Veterans is just another word for retired military personnel. If you were in the military and were not dishonorably discharged, you are a veteran, whether you did 2, 3, or 20 years.

I am pretty sure, though, that the rule applies to all regardless of discharge status.


> Veterans is just another word for retired military personnel.

A sergeant who leaves after a three-year enlistment is a veteran, but not a retiree.

The distinction matters because military retirees retain some privileges from their service, most importantly, a pension. Those privileges mean retirees fall under the emoluments clause.

However, a veteran not receiving retired pay is not subject to the emoluments clause as they have no relationship with the federal government. The Congressional Research Service states:

> Former servicemembers with no military status and not entitled to military retired pay can perform [foreign military service] on the same basis as a U.S. national who never served in the armed services. [1]

Interestingly, this implies a retiree could forfeit their retired pay to avoid being subject to the emoluments clause.

[1]: https://www.congress.gov/crs-product/IF12068


> This uses global state under the hood.

Looks safe to me. It uses `crypto/rand.Read` which is declared as safe for concurrent use. The cache is accessed via sync.Pool which is thread safe. As a check, I ran the tests with `-race` and it passed.
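For readers who haven't seen the pattern, here is a minimal sketch of what I mean. It is not the library's code: the names (`randCache`, `newID`) and the sizes are made up for illustration. The idea is that `sync.Pool` gives each caller exclusive ownership of a cache between `Get` and `Put`, so no extra locking is needed around the buffer.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"sync"
)

const (
	idSize    = 16          // bytes per ID (hypothetical)
	cacheSize = idSize * 64 // random bytes fetched per refill (hypothetical)
)

// randCache holds a batch of random bytes and the offset of the next unused byte.
type randCache struct {
	buf [cacheSize]byte
	pos int
}

// cachePool hands out caches; a cache is owned by exactly one goroutine
// between Get and Put.
var cachePool = sync.Pool{
	New: func() any {
		return &randCache{pos: cacheSize} // starts exhausted, forcing a refill
	},
}

// newID returns idSize random bytes, refilling the cache from crypto/rand
// when it runs out.
func newID() [idSize]byte {
	c := cachePool.Get().(*randCache)
	if c.pos+idSize > cacheSize {
		if _, err := rand.Read(c.buf[:]); err != nil {
			panic(err)
		}
		c.pos = 0
	}
	var id [idSize]byte
	copy(id[:], c.buf[c.pos:c.pos+idSize])
	c.pos += idSize
	cachePool.Put(c)
	return id
}

func main() {
	// Hammer newID from several goroutines; run with -race to check for data races.
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				_ = newID()
			}
		}()
	}
	wg.Wait()

	id := newID()
	fmt.Println(hex.EncodeToString(id[:]))
}
```

The pool may drop caches at any GC, which only costs an extra refill; no random bytes are ever handed out twice because each slice of the buffer is consumed once before `pos` advances.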


Funny timing—I tried optimizing the Otel Go SDK a few weeks ago (https://github.com/open-telemetry/opentelemetry-go/issues/67...).

I suspect you could make the tracing SDK 2x faster with some cleverness. The main tricks are:

- Use a faster time.Now(). Go does a fair bit of work to convert to the Go epoch.

- Use atomics instead of a mutex. I sent a PR, but the reviewer caught correctness issues. Atomics are subtle and tricky.

- Directly marshal protos instead of reflection with a hand-rolled library or with https://github.com/VictoriaMetrics/easyproto.

The gold standard is how TiDB implemented tracing (https://www.pingcap.com/blog/how-we-trace-a-kv-database-with...). Since Go purposefully (and reasonably) doesn't currently provide a comparable abstraction for thread-local storage, we can't implement similar tricks like special-casing when a trace is modified on a single thread.
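On the atomics point, here is a toy Go sketch of the kind of change I mean. It is not the Otel SDK's code; the `span` type and its fields are hypothetical. It makes `End` idempotent with a compare-and-swap instead of a mutex, and it also shows why this gets subtle: the check in `AddEvent` and the increment are two separate atomic operations.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// span is a hypothetical, stripped-down stand-in for a tracing span.
type span struct {
	ended  atomic.Bool  // set exactly once, by the first End call
	events atomic.Int64 // number of events counted
}

// AddEvent counts an event unless the span has already ended.
// Caveat: an End that lands between the Load and the Add still lets the
// event through. A mutex would rule that out; two atomics do not.
func (s *span) AddEvent() bool {
	if s.ended.Load() {
		return false
	}
	s.events.Add(1)
	return true
}

// End reports true only for the single caller that moves the span to ended.
func (s *span) End() bool {
	return s.ended.CompareAndSwap(false, true)
}

func main() {
	s := &span{}
	var winners atomic.Int64
	var wg sync.WaitGroup
	for i := 0; i < 16; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.AddEvent()
			if s.End() {
				winners.Add(1)
			}
		}()
	}
	wg.Wait()
	fmt.Println("End winners:", winners.Load())     // exactly one caller wins the CAS
	fmt.Println("events counted:", s.events.Load()) // depends on scheduling
}
```

Single-word state like the `ended` flag is the easy case. Once an invariant spans several fields, this is where the correctness issues the reviewer caught tend to show up.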


Would the sync.Pool trick mentioned here: https://hypermode.com/blog/introducing-ristretto-high-perf-g... help? It’s lossy but might be a good compromise.


It might be. I've seen the trick pop up a few times:

1. https://puzpuzpuz.dev/thread-local-state-in-go-huh

2. https://victoriametrics.com/blog/go-sync-pool/

It's probably too complex for the Otel SDK, but I might give it a spin in my experimental tracing repo.


There is an effort to use arrow format for metrics too - https://github.com/open-telemetry/otel-arrow - but no client that exports directly to it yet.


Another fun fact: Army Rangers trace their lineage to Rogers’ Rangers. Rogers fought for the crown in the Revolutionary War.

https://en.m.wikipedia.org/wiki/Rogers'_Rangers


> The convenience of writing `?` means nobody will bother wrapping errors anymore.

A thread from two days ago bemoans this point:

https://news.ycombinator.com/item?id=44149809


“Simply declaring” is an inaccurate description of the Go team’s decision. The team built several proposals, reviewed dozens more, and refined the process by gathering user feedback in multiple channels.

The Go team thoroughly explored the design space for seven years and did not find community consensus.


There are two possibilities.

1) There isn't consensus that improved syntax for error handling is needed in the first place. If that is the case, they should just say so, instead of obfuscating by focusing on the number of proposals and the length of the process.

2) There is consensus about a need for improved error handling syntax, but after seven years of proposals they haven't been able to find community consensus about the best way to add said syntax. That would mean that improved syntax for error handling is necessary, but the Go team is understandably hesitant to push forward and lock in a potentially inferior solution. If that is the case, then there would be reason to continue working on improved syntax for error handling, so as to find the best solution even if it takes a while.


I helped with the initial assessment for a migration from Postgres with Citus to SingleStore.

https://www.singlestore.com/made-on/heap/

