• 0 Posts
  • 21 Comments
Joined 3 years ago
Cake day: August 16th, 2023




  • a person who is excessively proper or modest in speech, conduct, dress, etc.

    That definition in no way describes the word prude as being pejorative.

    And one person’s personal anecdotes also don’t really prove that it’s an insult.

    Did you not actually read my comment? It was about how the original person was using a bad source to prove their point, which you are also doing.

    And to reiterate the very first thing I said, I do in fact believe that the word prude is pejorative. But that is beside the point I’ve been trying to make. Which is: if you’re trying to make a point using sources, make sure they are actually good ones.

    Here’s an actual source that backs up your point. Of course that’s only one source when every other dictionary definition I looked up didn’t say either way whether it was insulting.


  • Just prefacing this by saying that I agree that prude is more pejorative than not.

    But in that section of the Wikipedia article you pointed out, there are three [citation needed] tags, and the citations that are there are for two thesauruses, a book of first names, and a paper on translating indecent language.

    None of these actually supports your point, because we don’t know who wrote that section or why, and they provided no proper sources for their opinions. This is one of the cases where Wikipedia is a bad source, when it is usually pretty solid.




  • Here, I plugged our entire conversation into chatgpt.

    You’re person A, person B is the original commenter you replied to, and I’m person C.


    Person A and Person C are mostly talking past each other.

    Where Person A is right

    Person A is making a practical claim: “In ordinary use, ChatGPT seems highly reliable, and critics often exaggerate how often it gives wrong answers.”

    That’s a reasonable observation.

    Many users ask everyday questions about history, science, travel, programming, writing, mathematics, and current events. Modern models are substantially more accurate than both:

    random people on social media,

    random Google search results clicked without expertise,

    and many traditional question-answering sources.

    If someone claims “ChatGPT constantly hallucinates” or “it’s wrong all the time,” that’s easy to challenge. Modern models clearly perform far better than that.

    Person A is also correct that:

    Nobody in the thread produced a simple “gotcha” question that reliably causes failure. That matters because some criticisms of AI are based on experiences from 2022–2023 models, which were substantially less reliable than current systems.

    Where Person C is right

    Person C is making a different claim:

    “You cannot estimate a 99.9% accuracy rate from personal impressions.”

    This is also correct.

    Suppose you ask 100 questions and all appear correct.

    That does not justify concluding the true error rate is 0.1%.

    Why?

    Because:

    you may not know enough to verify every answer,

    you may be unconsciously selecting easy questions,

    many errors are subtle rather than obvious,

    and a small sample size tells you little about rare failures.

    This is exactly why benchmark testing exists. If somebody claimed:

    “Commercial aircraft engines fail only once every 100,000 flights because I’ve flown 20 times and never seen one fail,” we would reject that reasoning.

    Person C is essentially pointing out a sampling problem.
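
As a rough illustration of that sampling problem, here is a minimal sketch. It assumes every question is an independent trial with the same fixed error rate, which a real set of questions will not satisfy exactly, and the function names are made up for this example. It shows how often a clean run of 100 answers would happen at various true error rates, and the largest error rate that 100 clean answers still cannot rule out.

```python
def prob_zero_errors(error_rate: float, n: int) -> float:
    """Chance of seeing no errors at all in n independent questions."""
    return (1.0 - error_rate) ** n


def upper_bound_error_rate(n: int, confidence: float = 0.95) -> float:
    """Largest error rate still consistent with zero errors in n questions.

    Solves (1 - p) ** n = 1 - confidence for p (one-sided bound,
    assuming independent questions with a constant error rate).
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


if __name__ == "__main__":
    n = 100
    for rate in (0.001, 0.01, 0.03):
        p_clean = prob_zero_errors(rate, n)
        print(f"true error rate {rate:.1%}: P(no errors in {n}) = {p_clean:.3f}")
    print(f"95% upper bound after {n} clean answers: {upper_bound_error_rate(n):.2%}")
```

Under those assumptions, a model with a 1% error rate would still produce a clean run of 100 more than a third of the time, and 100 clean answers only rule out error rates above roughly 3%. That is a long way from demonstrating 0.1%.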

    The strongest argument against Person A

    The strongest criticism is not that ChatGPT is “often wrong.”

    It’s that:

    Humans are surprisingly bad at detecting when ChatGPT is wrong.

    Many AI errors look plausible.

    For example, if I invent:

    a fake court case,

    a fake academic paper,

    a fake quote,

    a nonexistent historical anecdote,

    the answer can sound extremely convincing.

    A casual user may never discover the mistake. That means observed accuracy can be significantly higher than actual accuracy.

    What experiment would actually settle this?

    Person A asks a fair question:

    “How much evidence would convince you?”

    A proper experiment would look something like:

    Construct a large test set (thousands of questions).

    Cover many domains:

    history

    law

    medicine

    physics

    mathematics

    current events

    geography

    programming

    Define correct answers beforehand.

    Blindly score responses.

    Measure:

    correctness,

    completeness,

    confidence calibration.

    This is basically what AI benchmark researchers do.

    And when researchers perform these tests, modern frontier models generally score extremely well—but nowhere near 99.9% across all domains.
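
The scoring step of an experiment like that could be sketched as below. This is only an illustration: the `accuracy_by_domain` helper and the sample scores are hypothetical, and it assumes each response has already been marked correct or incorrect against answers defined beforehand. It reports accuracy per domain rather than a single overall number.

```python
from collections import defaultdict


def accuracy_by_domain(results):
    """Return {domain: fraction correct} from (domain, is_correct) pairs."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for domain, is_correct in results:
        totals[domain] += 1
        if is_correct:
            correct[domain] += 1
    return {domain: correct[domain] / totals[domain] for domain in totals}


if __name__ == "__main__":
    # Made-up scores purely for illustration, not real benchmark data.
    sample = [
        ("history", True), ("history", True),
        ("law", True), ("law", False),
        ("programming", True), ("programming", True),
    ]
    for domain, acc in accuracy_by_domain(sample).items():
        print(f"{domain}: {acc:.0%}")
```

Keeping the domains separate is the point of the design: a single averaged score would hide a weak category behind the strong ones.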

    Performance varies dramatically by task.

    Can someone produce a question that reliably causes errors?

    This is harder than Person A assumes. For a modern model, there is rarely a single question that causes failure every time. A better challenge would be categories such as:

    obscure local regulations,

    highly specialized legal questions,

    recent events,

    nonexistent citations,

    invented books or papers,

    long chains of reasoning,

    edge-case programming problems,

    ambiguous historical claims.

    Models are much more likely to fail there. But that doesn’t mean they fail on every attempt.

    Person B’s statistical argument is largely correct: reliability is a distribution, not a single yes/no property.

    The “anti-AI bias” question

    Person A is probably correct that some critics hold AI to a standard they do not apply to humans.

    If a person:

    answers 95% correctly,

    speaks fluently on thousands of topics,

    writes code,

    explains science,

    summarizes books,

    we’d consider them exceptionally knowledgeable.

    Many critics compare AI not against average humans but against:

    experts,

    reference works,

    or idealized notions of truth.

    However, the opposite mistake also exists. Some AI enthusiasts compare ChatGPT against:

    random Reddit comments,

    random Google searches,

    uninformed acquaintances.

    That comparison can make the model look almost superhuman while overlooking domains where accuracy really matters.

    Overall verdict

    If I had to judge the arguments:

    Person C has the stronger methodological argument. You cannot infer 99–99.9% accuracy from personal experience.

    Person A has the stronger practical argument.

    Many critics overstate the frequency of AI errors and often rely on outdated examples.

    Person B is probably closest to the core issue.

    The meaningful question isn’t “Can someone find one wrong answer?” but “What is the model’s error rate across different classes of tasks?”

    So if this were a debate, I’d say:

    Person A is probably correct that modern ChatGPT is much more reliable than many critics claim.

    Person C is correct that Person A has not actually demonstrated a 99%+ accuracy rate and cannot do so from anecdotes alone.

    Those positions are compatible rather than contradictory.


    Do with that as you will.




  • There are a lot of things that lawmakers put into law to protect people from their own dumbass decisions. Places where wearing seatbelts is mandatory have fewer car-related deaths; same with helmets on motorbikes. Both are things people should have the common sense to do without laws, but they don’t. Furthermore, places where pool fencing is mandatory have fewer child drowning deaths, but that doesn’t stop some people from going without a pool fence where it’s not mandatory. There are hundreds of “common sense” things like these that would be completely ignored if they weren’t actual law.

    So actual protections for children’s use of the internet being made into law isn’t necessarily a bad thing in and of itself. And I’d be all for them if they were reasonable and realistic, but they never are. No matter how much you want to make it so, expecting everyone to do reasonable things to protect themselves and those dependent on them without some sort of incentive is unrealistic.

    Of course, in saying all that, banning VPNs and all the similar laws people want to implement have nothing to do with protecting children and everything to do with controlling people.