A simple visual comparison
Your comment makes Tom's case for him about why we don't evaluate forecasts based on simple visual comparisons. Anyone can look at the maps and set their own subjective standard for what counts as "a good match." One person might not be satisfied unless it's a perfect match (though that's impossible to achieve, because some aspects of the climate system are fundamentally not predictable and never will be). Another person might consider it a good forecast if at least 80 percent of the maps match, while another might feel that anything greater than 50 percent is good.
My simple visual comparison of these maps concludes that the temperature forecast verified for Texas, Louisiana, Missouri, Mississippi, Alabama, Georgia, Florida, South Carolina, North Carolina, Tennessee, Kentucky, West Virginia, Virginia, Maryland, Delaware, Pennsylvania, New Jersey, New York, Connecticut, Rhode Island, Vermont, New Hampshire, Maine, most of Washington, Oregon, Idaho, Montana, and Wyoming, and around half of North Dakota, South Dakota, and Nebraska.
That's far from perfect, but that's a lot of states! It's definitely not gaslighting. Personally, I would call that pretty good! You might call it lousy. That's why they don't score forecasts based on people's subjective judgments.
The Heidke Skill Score (HSS) provides an objective, statistically consistent way to rank them. A score of zero means the forecast did no better than you'd expect from guessing "above" a third of the time, "below" a third of the time, and "average" a third of the time. Anything above that means there was some skill. Now sure, we can argue about just how far above zero it needs to be to be practically useful for a particular purpose, but it definitely doesn't have to be perfect to be useful.
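To make that zero baseline concrete, here's a minimal sketch of the calculation, assuming the common form of the Heidke score for three-category outlooks: the number of correct calls is compared against the number you'd expect from pure guessing, then scaled so a perfect forecast scores 100. The function name and arguments are just illustrative, not any agency's official code.

```python
def heidke_skill_score(num_correct, num_total, num_categories=3):
    """Heidke skill score versus random guessing among equally likely categories.

    Returns 100 for a perfect forecast, 0 when the hit rate equals chance
    (one-third for three categories), and negative values for worse than chance.
    """
    expected_hits = num_total / num_categories  # hits expected from pure guessing
    return 100.0 * (num_correct - expected_hits) / (num_total - expected_hits)

# Example: 60 of 100 locations called correctly in a three-category forecast
print(heidke_skill_score(60, 100))  # -> 40.0: clearly skillful, far from perfect
```

Notice that getting a third of the locations right scores exactly zero, so "a lot of states matched" only counts as skill to the extent it beats that guessing baseline.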