Ask HN: How do you guard against ChatGPT use in technical interviews?
5 points by calabin 17 days ago | 30 comments
Yesterday, we did two basic-screen technical interviews where both candidates appeared to use LLMs to generate nearly all of their answers.

We do this quick screen after a 30-min behavioral interview to make sure that candidates can generally operate at the skill level they claim on their resumes.

In the past, we've been shocked by the number of people who will talk a big game, but have really rudimentary programming skills when the rubber meets the road.

The questions are:

1. FizzBuzz

2. Generate the first 20 rows of Pascal's Triangle (sketched below)

3. Drop all non-prime integers from a pre-defined set of 2-to-N
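
For concreteness, question 2 amounts to something like the following minimal sketch (shown in Python here, though candidates can use whatever language they like; the function name is just illustrative):

    # Build Pascal's Triangle row by row: each interior entry is the
    # sum of the two adjacent entries in the previous row.
    def pascals_triangle(n_rows=20):
        rows = [[1]]
        for _ in range(n_rows - 1):
            prev = rows[-1]
            mid = [prev[i] + prev[i + 1] for i in range(len(prev) - 1)]
            rows.append([1] + mid + [1])
        return rows

    for row in pascals_triangle(20):
        print(row)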

We didn't really suspect the first candidate until the second candidate provided nearly letter-for-letter the same answers (same variable names, function names, etc.).

After the interviews, we popped our deck into ChatGPT and Claude, and they output exactly what these two candidates had provided.

Last week, a third candidate sent us clearly ChatGPT'd code as an example of some of his work.

I'm unsure what to do here, so I come to you, HN, to ask: what have you done to guard against the use of LLMs in remote technical interviews? Thanks!

Bonus: The nail in the coffin was when the second candidate immediately clocked the last question as leveraging the Sieve of Eratosthenes. Previously, he'd shown us a pretty impressive portfolio. When asked how he knew the Sieve of Eratosthenes off the top of his head, he claimed he had used it in one of his commercial portfolio projects but couldn't explain how.




Ask them to share their screen, walk you through their code and explain their solution live. Every once in a while ask why they did it in that way and what alternative approaches there could be. Then change the problem a little and ask them to modify their code, again live. Generally, try to go deep and ask follow-up "why" or "how" questions. Those who "don't remember" or only offer vague and shallow answers are likely to have cheated with LLMs (or are just poor candidates).


This is generally what we do and what raised our suspicions in the first place - they both could "walk us through" their code, but had trouble explaining why they did certain things, how they could improve things, etc.

We thought that the approach you've outlined would generally be good enough, and it has let us catch instances where people lean heavily on LLMs, but our issue now is that everyone appears to be using these things. Admittedly, our sample size here is low (n=3), but it's frustrating nonetheless.


You could try to give a challenge that has a few hidden gotchas, and discard candidates that do not spot them. How to do this depends on the role you are hiring for.

For example, in our data scientist interviews we also ask candidates to analyze datasets with imbalanced classes, outliers, correlated samples, etc. Correctly dealing with these issues requires particular techniques, and, most importantly, the candidate has to explicitly check whether these issues are present. Those who use LLMs mindlessly will not even realize this is the case.
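
As a concrete illustration of the kind of check we mean, here is a hypothetical Python/pandas sketch - the file name and the "label" column are placeholders, not part of our actual interview:

    import pandas as pd

    # Before any modeling, explicitly check the class balance
    # (assumes the target variable lives in a 'label' column).
    df = pd.read_csv("interview_dataset.csv")  # placeholder dataset
    counts = df["label"].value_counts(normalize=True)
    print(counts)
    if counts.min() < 0.10:  # arbitrary threshold for this sketch
        print("Classes are imbalanced: consider resampling or class weights.")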


I like this - a huge part of our engineering work is ETL pipelines, so giving them some data to process makes things a lot harder to fake.


While I'm personally not keen on LLMs, I admit I do have the Copilot extension installed in Visual Studio and have been pleasantly surprised at how well tab completion works. It seems effective for small blocks of code.

So, remembering that I'm not really a fan as I ask this... why do you care if a candidate uses an LLM or Google as part of your interview? Do you care if they use an IDE or a code-completion plugin? In the end, don't you really want to evaluate whether the candidate can produce good, clean code?

If you feel like an LLM is too big a crutch, is that because what you wanted to test was memorization of a framework, or thought and workflow strategies?

To invoke a resource I'm also not keen on but understand why it exists: does your concern about ChatGPT during interviews actually point to an XY problem?


The screen we are performing here is a basic "can you program" type of evaluation.

We've run into a number of people with seemingly-decent resumes (several positions as engineers at reputable albeit non-FAANG companies like insurers or e-commerce firms) who have struggled to complete basic tasks like the Pascal's Triangle question mentioned above.

The intent here is to toss them a couple of softballs that they should be able to knock out of the park, almost like if they were helping a younger sibling with CS 1XX or 2XX level work.

We're not against the use of Copilot, etc. once onboarded. We just want to make sure that these candidates possess basic skills that their resumes would suggest they mastered years ago.


Your questions are pointless. I have no interest in CS puzzles at all, I don't memorize or practice them, and that's what you're testing for. I write code to build stuff.

Why not have a conversation with them about things they have built instead? Not form questions that say "tell us about a time you encountered a problem you had to solve" but an actual conversation. Like... an interview.


This screen is after the "walk us through your resume, explain what you've built" conversation, where we ask them about specific challenges, decisions they made, etc.

Everyone who has made it to this screen has already spoken to us about their previous work and at least seemed to have some skill in their general area of programming.

What we're testing for doesn't need to be memorized or practiced. Anyone who can program for money should be able to do things like FizzBuzz, repeat the pattern in Pascal's Triangle with a for loop, or come up with a basic strategy to eliminate non-primes.

Even if you blow the part of the interview where you need to identify prime/not-prime, as long as you show us that you have a process, you're fine.
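
To illustrate, plain trial division would count as a perfectly fine "basic strategy" here - a minimal Python sketch, with purely illustrative names:

    # Trial division: a basic, no-frills way to drop non-primes.
    def is_prime(n):
        if n < 2:
            return False
        for d in range(2, int(n ** 0.5) + 1):
            if n % d == 0:
                return False
        return True

    primes = {n for n in range(2, 101) if is_prime(n)}  # N = 100 for illustration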

Maybe we just suck at vetting what candidates claim to have done, or are too charitable about it - I'm unsure.


Would you hire someone who can't solve FizzBuzz without an LLM? Are we being for real here? FizzBuzz tests the absolute basics of understanding control flow.


Here is the thing though. Solving "FizzBuzz" is trivial if you happen to know how to do it. Same with "Pascal's Triangle" or many LeetCode style interview questions.

Solving either, particularly the latter, when you don't "know" the test and are under pressure is not trivial. I have been in interviews for Ph.D. candidates who, when asked simple questions like "Why do you want a Ph.D.?", suddenly can't even remember how to speak English, let alone explain why they want the advanced degree.

FizzBuzz is kind of a trap. If you don't know it, it's tempting to try nested if statements: candidates want to write efficient code and can fall into the trap of not wanting to repeat a test for 3, or they try a test for 3, then 5, then 15, before realizing that the easiest solution is three properly ordered tests and that there is no "elegant" way to solve this trivial problem.
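
For concreteness, the properly ordered version looks something like this in Python - the point being that 15 must be tested before 3 and 5, and no nesting is needed:

    for i in range(1, 101):
        if i % 15 == 0:
            print("FizzBuzz")
        elif i % 3 == 0:
            print("Fizz")
        elif i % 5 == 0:
            print("Buzz")
        else:
            print(i)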

In fact, I work every day with data scientists who write perfect classification code and generate complex statistical models, yet who would likely struggle to write FizzBuzz on a whiteboard in front of the entire team.

I hate interview questions like these and struggle to connect them to performance. If one must test, then give a small project (something that might take 30 minutes to an hour) and ask the candidate to complete it at their own pace, publishing their work in progress to a git repo. Can they solve the problem (even if they used Google and/or an LLM)? Are their commits meaningful units of work? Can I read their code in a review? Did they ask for proper clarification when uncertain? Do they have tests?

I find this much more salient than FizzBuzz.


The only prior knowledge FizzBuzz requires is the first few weeks of an introduction to programming.


No, I wouldn't waste either of our time asking that question.


The Sieve of Eratosthenes is often introduced alongside prime numbers in elementary school math class, and it has a funny, memorable name - it's not that weird for someone to know it. It is weird to lie about having a practical use case for it, since running it out to a cryptographically useful prime length is infeasible and it requires O(n) memory.
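
For reference, a minimal Python sketch of the sieve - the boolean array of size n is exactly the O(n) memory cost mentioned above:

    # Sieve of Eratosthenes: one flag per integer up to n (O(n) memory).
    def primes_up_to(n):
        is_prime = [True] * (n + 1)
        is_prime[0] = is_prime[1] = False
        for p in range(2, int(n ** 0.5) + 1):
            if is_prime[p]:
                for multiple in range(p * p, n + 1, p):
                    is_prime[multiple] = False
        return [i for i, flag in enumerate(is_prime) if flag]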

It seems vanishingly unlikely that this type of question can provide any signal anymore outside an in-person interview. The incentives are just too strong for candidates, and the tools are too good.


Great point on the Sieve - knowing it off the top of his head wasn't itself a smoking gun for cheating, but claiming that he'd used it in a B2B SaaS app was. Especially given that he wasn't able to explain how he'd allegedly used it.

The candidates we're bringing into this screen typically have 1-3 prior positions on their resume, so the point here is to throw them some softballs that they should be able to crank through with some ease to demonstrate that their basic programming skills are there.

We've had experiences where people who have held legitimate programming jobs at F1000 companies struggle greatly with some of the basic questions that I've listed above. I'm not sure why, but it's the case.

We try as best we can to adjust for anxiety - I know that programming in front of others can suck - but all the same, we're just trying to establish: "Before we go forward, can you do some elementary tasks that anyone with your claimed experience should be able to do?"

Do you have any suggestions on better questions?


All the questions I come up with that are both narrow enough in scope to reasonably do in an interview and get unhelpful answers from an LLM end up being trick questions for humans too, which is not really fair or productive.

What's particularly frustrating is that I've got some actual unsolved algorithms problems I want a solution to. One might think that since the LLMs are smart enough to re-solve problems from their training data in interviews, they could help - alas, they have been of no help so far.


It's a great question.

What we do is ask the candidate to keep their hands visible to the camera for the interview. But some setups are voice-only, and this won't work in those cases.

Probably the best way would be to have the ChatGPT answer beforehand and compare it with what the candidate is saying?


Having the GPT answer beforehand would allow us to confront them about it sooner, but at that point we're definitely not hiring them, since they thought it a good idea to use an LLM to cheat on questions they should be able to do in their sleep.

Interesting that you're asking the candidate to have their hands visible. We haven't wanted to have to go that way, but we might.


Ask questions that involve trade-offs, be they design or performance related, and explicitly tell candidates that you're benchmarking against ChatGPT and expect something beyond what an LLM would give, i.e. you're looking more for creative/critical thinking than mere correctness.


This is an interesting direction - especially benchmarking against GPT and telling the candidate we are.

Do you have any suggestions about the type of questions we could be asking here?


This would be like asking a carpenter to build you something without a hammer. At this point we need to realize that an LLM is a tool like anything else. Maybe give them an LLM challenge: how would you do X? What is your prompt? Why?


I disagree with the carpenter-hammer analogy here pretty strongly.

We're basically trying to figure out if they can code generally, or if somehow they've skated by in their last positions without the fundamental skills of programming.

I'm not sure how, but we've come across a number of programmers from F1000 companies who can't seem to hit some of the basics in their chosen language.

LLMs have their place as a tool, but before we empower them with the latest and greatest programming assistance, we want to make sure that they have the skills to do things like critically interpret the output of Copilot, etc.

We want to make sure that the people we hire possess the skills they claim, and that they won't serve as a very slow wrapper around the LLM tools we already pay for.


LLMs are absolutely not a hammer. LLMs are an Ikea shelf in a box. You're not a carpenter because you can assemble one.

We'll leave aside the argument over whether a carpenter would, in some situations, rather buy a cheap Ikea shelf than build one - but that, too, is applicable to this analogy.


Just ask the candidate whether they have contributed to relevant large open-source projects in the language you are hiring for. If not, then give them a hard Leetcode question.

All you have to do is ask the candidate to complete this hard Leetcode puzzle in Rust.

ChatGPT will struggle to help the candidate as it generates garbage.

After they have completed it, then in the second technical interview, question the candidate about how they came up with the solution, step by step, to show whether they really understand both the language and the algorithm used to solve the puzzle.

This rigorously filters out 95% of frauds and impostors whilst targeting the best and brightest (really).

Job done.


Contributions to open-source projects would allow us to bypass this screen altogether, but great point.

Haha we might have to switch to Rust just to ease our interviewing woes here - we're primarily a PHP/Laravel shop, and have tried to be as charitable as possible to candidates by allowing them to program in the language they're strongest in. Perhaps we need to change that.

The screen we're doing is meant to filter out the frauds/imposters, and so far it has a 100% success rate - unfortunately, that means catching three out of the last three people who have made it to that point. It's become a huge waste of time.

Maybe we just need to source candidates from open-source contributors to prevent this, or something of that nature.


The problem is Leetcode style interviews, not ChatGPT.


We're definitely not doing any "Leetcode style" interviews over here - these are "can you perform the basic programming tasks your resume claims" type questions.


Do you really need all three of them (FizzBuzz, Pascal's and non-primes) to demonstrate basic programming tasks?


Not really, since failing FizzBuzz basically ends the interview.

That said, Pascal's and non-primes allow us to do a few minutes of basic "that looks great, what are some ideas for optimizing it?" work together once they have established that they have the basics of programming.

I am incredibly open to suggestions on better ways to evaluate people for both (a) their basic ability to code, and (b) their ability to think about code/optimization.


TIL Fizz Buzz is "leetcode"


Always has been: leetcode.com/problems/fizz-buzz



