Ask HN: How do you guard against ChatGPT use in technical interviews?
5 points by calabin 17 days ago | 30 comments
Yesterday, we did two basic-screen technical interviews where both candidates appeared to use LLMs to generate nearly all of their answers.

We do this quick screen after a 30-min behavioral interview to make sure that candidates can generally operate at the skill level they claim on their resumes.

In the past, we've been shocked by the number of people who will talk a big game, but have really rudimentary programming skills when the rubber meets the road.

The questions are:

1. FizzBuzz

2. Generate the first 20 rows of Pascal's Triangle (sketched below)

3. Drop all non-prime integers from a pre-defined set of 2-to-N
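
For concreteness, question 2 amounts to something like the following minimal sketch (shown in Python here, though candidates can use whatever language they like; the function name is just illustrative):

    # Build Pascal's Triangle row by row: each interior entry is the
    # sum of the two adjacent entries in the previous row.
    def pascals_triangle(n_rows=20):
        rows = [[1]]
        for _ in range(n_rows - 1):
            prev = rows[-1]
            mid = [prev[i] + prev[i + 1] for i in range(len(prev) - 1)]
            rows.append([1] + mid + [1])
        return rows

    for row in pascals_triangle(20):
        print(row)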

We didn't really suspect the first candidate until the second candidate provided nearly letter-for-letter the same answers (same variable names, function names, etc.).

After the interviews, we popped our deck into ChatGPT and Claude, and they output exactly what these two candidates had provided.

Last week, a third candidate sent us clearly ChatGPT'd code as an example of some of his work.

I'm unsure what to do here, so I come to you, HN, to ask: what have you done to guard against the use of LLMs in remote technical interviews? Thanks!

Bonus: The nail in the coffin was when the second candidate immediately clocked the last question as leveraging the Sieve of Eratosthenes. Previously, he'd shown us a pretty impressive portfolio. When asked how he knew the Sieve of Eratosthenes off the top of his head, he claimed he had used it in one of his commercial portfolio projects but couldn't explain how.




Ask them to share their screen, walk you through their code and explain their solution live. Every once in a while ask why they did it in that way and what alternative approaches there could be. Then change the problem a little and ask them to modify their code, again live. Generally, try to go deep and ask follow-up "why" or "how" questions. Those who "don't remember" or only offer vague and shallow answers are likely to have cheated with LLMs (or are just poor candidates).


This is generally what we do and what raised our suspicions in the first place - they both could "walk us through" their code, but had trouble explaining why they did certain things, how they could improve things, etc.

We thought that the approach you've outlined would generally be good enough, and it has let us catch instances where people lean heavily on LLMs, but our issue now is that everyone appears to be using these things. Admittedly, our sample size here is low (n=3), but it's frustrating nonetheless.


You could try to give a challenge that has a few hidden gotchas, and discard candidates that do not spot them. How to do this depends on the role you are hiring for.

For example, in our data scientist interviews we also ask candidates to analyze datasets with imbalanced classes, outliers, correlated samples, etc. Correctly dealing with these issues requires particular techniques, and, most importantly, the candidate has to explicitly check whether these issues are present. Those who use LLMs mindlessly will not even realize this is the case.
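
As a concrete illustration of the kind of check we mean, here is a hypothetical Python/pandas sketch - the file name and the "label" column are placeholders, not part of our actual interview:

    import pandas as pd

    # Before any modeling, explicitly check the class balance
    # (assumes the target variable lives in a 'label' column).
    df = pd.read_csv("interview_dataset.csv")  # placeholder dataset
    counts = df["label"].value_counts(normalize=True)
    print(counts)
    if counts.min() < 0.10:  # arbitrary threshold for this sketch
        print("Classes are imbalanced: consider resampling or class weights.")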


I like this - a huge part of our engineering work is ETL pipelines, so giving them some data to process makes things a lot harder to fake.


While I'm personally not keen on LLMs, I admit I do have the Copilot extension installed in Visual Studio and have been pleasantly surprised at how well tab completion works. It seems effective for small blocks of code.

So, remembering that I'm not really a fan as I ask this... why do you care if a candidate uses an LLM or Google as part of your interview? Do you care if they use an IDE or a code-completion plugin? In the end, don't you really want to evaluate whether the candidate can produce good, clean code?

If you feel like an LLM is too big a crutch, is that because what you wanted to test was memorization of a framework, or thought and workflow strategies?

To invoke a resource I'm also not keen on but understand why it exists: does your concern about ChatGPT during interviews actually point to an XY problem?


The screen we are performing here is a basic "can you program" type of evaluation.

We've run into a number of people with seemingly-decent resumes (several positions as engineers at reputable albeit non-FAANG companies like insurers or e-commerce firms) who have struggled to complete basic tasks like the Pascal's Triangle question mentioned above.

The intent here is to toss them a couple of softballs that they should be able to knock out of the park, almost like if they were helping a younger sibling with CS 1XX or 2XX level work.

We're not against the use of Copilot, etc. once onboarded. We just want to make sure that these candidates possess basic skills that their resumes would suggest they mastered years ago.


Your questions are pointless. I have no interest in CS puzzles at all, I don't memorize or practice them, and that's what you're testing for. I write code to build stuff.

Why not have a conversation with them about things they have built instead? Not form questions that say "tell us about a time you encountered a problem you had to solve" but an actual conversation. Like... an interview.


This screen is after the "walk us through your resume, explain what you've built" conversation, where we ask them about specific challenges, decisions they made, etc.

Everyone who has made it to this screen has already spoken to us about their previous work and at least seemed to have some skill in their general area of programming.

What we're testing for doesn't need to be memorized or practiced. Anyone who can program for money should be able to do things like FizzBuzz, repeat the pattern in Pascal's Triangle with a for loop, or come up with a basic strategy to eliminate non-primes.

Even if you blow the part of the interview where you need to identify prime/not-prime, as long as you show us that you have a process, you're fine.
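
To illustrate, plain trial division would count as a perfectly fine "basic strategy" here - a minimal Python sketch, with purely illustrative names:

    # Trial division: a basic, no-frills way to drop non-primes.
    def is_prime(n):
        if n < 2:
            return False
        for d in range(2, int(n ** 0.5) + 1):
            if n % d == 0:
                return False
        return True

    primes = {n for n in range(2, 101) if is_prime(n)}  # N = 100 for illustration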

Maybe we just suck at vetting what candidates claim to have done, or are too charitable about it - I'm unsure.


Would you hire someone who can't solve FizzBuzz without an LLM? Are we being for real here? FizzBuzz tests the absolute basics of understanding control flow.


Here is the thing though. Solving "FizzBuzz" is trivial if you happen to know how to do it. Same with "Pascal's Triangle" or many LeetCode style interview questions.

Solving either, particularly the latter, when you don't "know" the test and are under pressure is not trivial. I have been in interviews for Ph.D. candidates who, when asked simple questions like "Why do you want a Ph.D.?", suddenly can't even remember how to speak English, let alone explain why they want the advanced degree.

FizzBuzz is kind of a trap. If you don't know it, it's tempting to try nested if statements: candidates want to write efficient code and can fall into the trap of not wanting to repeat a test for 3, or they try a test for 3, then 5, then 15, before realizing that the easiest solution is three properly ordered tests and that there is no "elegant" way to solve this trivial problem.
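
For concreteness, the properly ordered version looks something like this in Python - the point being that 15 must be tested before 3 and 5, and no nesting is needed:

    for i in range(1, 101):
        if i % 15 == 0:
            print("FizzBuzz")
        elif i % 3 == 0:
            print("Fizz")
        elif i % 5 == 0:
            print("Buzz")
        else:
            print(i)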

In fact, I work every day with data scientists who write perfect classification code and generate complex statistical models, yet who would likely struggle to write FizzBuzz on a whiteboard in front of the entire team.

I hate interview questions like these and struggle to connect them to performance. If one must test, then give a small project (something that might take 30 minutes to an hour) and ask the candidate to complete it at their own pace, publishing their work in progress to a git repo. Can they solve the problem (even if they used Google and/or an LLM)? Are their commits meaningful units of work? Can I read their code in a review? Did they ask for proper clarification when uncertain? Do they have tests?

I find this much more salient than FizzBuzz.


The only prior knowledge FizzBuzz requires is the first few weeks of an introduction to programming.


No, I wouldn't waste either of our time asking that question.


The Sieve of Eratosthenes is often introduced alongside prime numbers in elementary school math class, and it has a funny, memorable name - it's not that weird for someone to know it. It is weird to lie about having a practical use case for it, since running it out to a cryptographically useful prime length is infeasible and it requires O(n) memory.
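
For reference, a minimal Python sketch of the sieve - the boolean array of size n is exactly the O(n) memory cost mentioned above:

    # Sieve of Eratosthenes: one flag per integer up to n (O(n) memory).
    def primes_up_to(n):
        is_prime = [True] * (n + 1)
        is_prime[0] = is_prime[1] = False
        for p in range(2, int(n ** 0.5) + 1):
            if is_prime[p]:
                for multiple in range(p * p, n + 1, p):
                    is_prime[multiple] = False
        return [i for i, flag in enumerate(is_prime) if flag]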

It seems vanishingly unlikely that this type of question can provide any signal anymore outside an in-person interview. The incentives are just too strong for candidates, and the tools are too good.


Great point on the Sieve - knowing it off the top of his head wasn't itself a smoking gun for cheating, but claiming that he'd used it in a B2B SaaS app was. Especially given that he wasn't able to explain how he'd allegedly used it.

The candidates we're bringing into this screen typically have 1-3 prior positions on their resume, so the point here is to throw them some softballs that they should be able to crank through with some ease to demonstrate that their basic programming skills are there.

We've had experiences where people who have held legitimate programming jobs at F1000 companies struggle greatly with some of the basic questions that I've listed above. I'm not sure why, but it's the case.

We try as best we can to adjust for anxiety - I know that programming in front of others can suck - but all the same, we're just trying to establish: "Before we go forward, can you do some elementary tasks that anyone with your claimed experience should be able to do?"

Do you have any suggestions on better questions?


All the questions I come up with that are both narrow enough in scope to reasonably do in an interview and get unhelpful answers from an LLM end up being trick questions for humans too, which is not really fair or productive.

What's particularly frustrating is that I've got some actual unsolved algorithms problems I want a solution to. One might think that since the LLMs are smart enough to re-solve problems from their training data in interviews, they could help - alas, they have been of no help so far.


It's a great question.

What we do is ask the candidate to keep their hands visible to the camera for the interview. But some setups are voice-only, and this won't work in those cases.

Probably the best way would be to have the ChatGPT answer beforehand and compare it with what the candidate is saying?


Having the GPT answer beforehand would allow us to confront them about it sooner, but at that point we're definitely not hiring them, since they thought it a good idea to use an LLM to cheat on questions they should be able to do in their sleep.

Interesting that you're asking the candidate to have their hands visible. We haven't wanted to have to go that way, but we might.


Ask questions that involve trade-offs, be they design or performance related, and explicitly tell candidates that you're benchmarking against ChatGPT and expect something beyond what an LLM would give, i.e. you're looking more for creative/critical thinking than mere correctness.


This is an interesting direction - especially benchmarking against GPT and telling the candidate we are.

Do you have any suggestions about the type of questions we could be asking here?


This would be like asking a carpenter to build you something without a hammer. At this point we need to realize that an LLM is a tool like anything else. Maybe give them an LLM challenge: how would you do X? What is your prompt? Why?


I disagree with the carpenter-hammer analogy here pretty strongly.

We're basically trying to figure out if they can code generally, or if somehow they've skated by in their last positions without the fundamental skills of programming.

I'm not sure how, but we've come across a number of programmers from F1000 companies who can't seem to hit some of the basics in their chosen language.

LLMs have their place as a tool, but before we empower them with the latest and greatest programming assistance, we want to make sure that they have the skills to do things like critically interpret the output of Copilot, etc.

We want to make sure that the people we hire possess the skills they claim, and that they won't serve as a very slow wrapper around the LLM tools we already pay for.


LLMs are absolutely not a hammer. LLMs are an Ikea shelf in a box. You're not a carpenter because you can assemble one.

We'll leave aside the argument over whether a carpenter would, in some situations, rather buy a cheap Ikea shelf than build one - but that, too, is applicable to this analogy.


Just ask the candidate whether they have contributed to relevant large open-source projects in the language you are hiring for. If not, then give them a hard Leetcode question.

All you have to do is ask the candidate to complete this hard Leetcode puzzle in Rust.

ChatGPT will struggle to help the candidate as it generates garbage.

After they have completed it, then in the second technical interview, question the candidate about how they came up with the solution, step by step, to show whether they really understand both the language and the algorithm used to solve the puzzle.

This rigorously filters out 95% of frauds and impostors whilst targeting the best and brightest (really).

Job done.


Contributions to open-source projects would allow us to bypass this screen altogether, but great point.

Haha we might have to switch to Rust just to ease our interviewing woes here - we're primarily a PHP/Laravel shop, and have tried to be as charitable as possible to candidates by allowing them to program in the language they're strongest in. Perhaps we need to change that.

The screen we're doing is meant to filter out the frauds/imposters, and so far it has a 100% success rate - unfortunately, that means catching three out of the last three people who have made it to that point. It's become a huge waste of time.

Maybe we just need to source candidates from open-source contributors to prevent this, or something of that nature.


The problem is Leetcode style interviews, not ChatGPT.


We're definitely not doing any "Leetcode style" interviews over here - these are "can you perform the basic programming tasks your resume claims" type questions.


Do you really need all three of them (FizzBuzz, Pascal's and non-primes) to demonstrate basic programming tasks?


Not really, since failing FizzBuzz basically ends the interview.

That said, Pascal's and non-primes allow us to do a few minutes of basic "that looks great, what are some ideas for optimizing it?" work together once they have established that they have the basics of programming.

I am incredibly open to suggestions on better ways to evaluate people for both (a) their basic ability to code, and (b) their ability to think about code/optimization.


TIL Fizz Buzz is "leetcode"


Always has been: leetcode.com/problems/fizz-buzz



