3.1 million malicious fake stars discovered on GitHub – And the trend is rising
In a comprehensive study, a US research team has discovered millions of fake stars on GitHub and warns of a rapidly growing trend.
(Image: created with Dall-E by iX)
A US research team has flagged millions of stars on GitHub as suspicious. The fake stars are used to promote repositories that distribute phishing, game cheats, crypto bots and ultimately malware. To analyze this, the researchers developed a tool called StarScout. They conclude that star counts should not be trusted as the sole indicator of quality and recommend considering other criteria as well.
In their study, the researchers from Carnegie Mellon University, Socket Inc. and North Carolina State University analyzed all events from the last five years in the GH Archive, a BigQuery warehouse that archives all GitHub events: 60.54 million users, 0.31 billion repositories, 0.61 billion stars and 6.01 billion other events, 20 terabytes of data in total. The team looked for two kinds of abnormal behavior: accounts that show very little activity, and accounts whose activity coincides with that of other accounts. The method is based on CopyCatch, which was originally used to detect fake likes on Facebook.
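The two patterns described above can be illustrated with a deliberately naive sketch. This is not StarScout and not CopyCatch. The event tuples, function names, thresholds and the fixed time buckets are all assumptions chosen for illustration, and a real analysis would run against the GH Archive data rather than a hand-written list.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical star events: (account, repository, unix timestamp in seconds).
events = [
    ("a", "r1", 100), ("a", "r2", 200),
    ("b", "r1", 150), ("b", "r2", 250),
    ("c", "r1", 120), ("c", "r2", 220),
    ("d", "r1", 90000),
    ("e", "r3", 50), ("e", "r4", 5000), ("e", "r1", 40000),
]

def low_activity_accounts(events, max_events=1):
    """Return accounts with at most max_events events in total."""
    counts = defaultdict(int)
    for user, _repo, _ts in events:
        counts[user] += 1
    return {user for user, n in counts.items() if n <= max_events}

def lockstep_accounts(events, window=3600, min_shared=2, min_group=3):
    """Return accounts that starred the same repos in the same time bucket.

    Two accounts are linked if they share at least min_shared
    (repo, time bucket) pairs. An account is flagged if it is linked to
    at least min_group - 1 other accounts. Fixed buckets are a
    simplification: stars just before and after a bucket boundary are
    not matched.
    """
    buckets = defaultdict(set)
    for user, repo, ts in events:
        buckets[user].add((repo, ts // window))

    links = defaultdict(set)
    for u, v in combinations(sorted(buckets), 2):
        if len(buckets[u] & buckets[v]) >= min_shared:
            links[u].add(v)
            links[v].add(u)
    return {user for user, peers in links.items() if len(peers) >= min_group - 1}

if __name__ == "__main__":
    print("low activity:", sorted(low_activity_accounts(events)))
    print("lockstep:", sorted(lockstep_accounts(events)))
```

The pairwise comparison grows quadratically with the number of accounts, so it only works on toy data. It is meant to show the idea of "little activity" versus "acting in step with others", not a method that would scale to terabytes of events.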
(Image: Hao He et al./Arxiv.org)
In a first step, StarScout discovered 4.53 million fake stars in 22,915 repos, given by 1.32 million accounts. Of these fakes, 0.95 million came from accounts with a low level of activity and 3.58 million fell under the lockstep pattern, meaning activity in step with other accounts. After cleaning the data, 15,835 repos with 3.1 million fakes from 278,000 accounts remained. Most of these repos are very short-lived and only exist for a few days. More than 60 percent of them do nothing but distribute fake stars.
There have already been problems with contaminated repositories and fake star campaigns in the past.
Sharp increase in the last year
The researchers find it particularly worrying that fake stars increased a hundredfold in 2024 compared to the previous year. The team identified phishing, game cheats and crypto bots as the main content promoted by the campaigns. But "it is likely that they are in fact malware lures." For normal repos, on the other hand, faking stars is of little use, as it only leads to a very short-term increase in real stars. On the contrary: "After two months, fake stars tend to have a negative effect (especially less gain in real stars)."
(Image: Hao He et al./Arxiv.org)
In conclusion, the researchers write: "The star count is an unreliable indicator of quality and should not be used for higher-value decisions, at least not on its own." It is essential to also evaluate other signals, such as the Open Source Security Foundation's Scorecard. At least GitHub quickly deleted all the remaining repositories reported by the research group.
(who)