Open source archives hosting malicious software packages
James E Keenan
2017-09-15 23:11:49 UTC
http://www.theregister.co.uk/2017/09/15/pretend_python_packages_prey_on_poor_typing/

Would CPAN be subject to the same problem as described in the article above?
David Cantrell
2017-09-20 14:33:53 UTC
Post by James E Keenan
http://www.theregister.co.uk/2017/09/15/pretend_python_packages_prey_on_poor_typing/
Would CPAN be subject to the same problem as described in the article above?
Yes.

DBI::Class, for example, could be a typo for DBIx::Class or a
misremembered Class::DBI, and there's nothing stopping anyone from
uploading a DBI::Class package that does all kinds of dodgy stuff.
--
David Cantrell | semi-evolved ape-thing

Longum iter est per praecepta, breve et efficax per exempla.
James E Keenan
2017-09-20 22:08:34 UTC
Post by David Cantrell
http://www.theregister.co.uk/2017/09/15/pretend_python_packages_prey_on_poor_typing/
Would CPAN be subject to the same problem as described in the article above?
Yes.
DBI::Class, for example, could be a typo for DBIx::Class or a
misremembered Class::DBI, and there's nothing stopping anyone from
uploading a DBI::Class package that does all kinds of dodgy stuff.
There are plenty of confusable (small edit distance) pairs of module names on CPAN.
For example,
Algorithm::SVM and Algorithm::VSM
AI::POS and AI::PSO
Both pairs come from different dists. This is more likely with short acronyms.
One thing we could do is have a tool looking at newly registered package names and alert the PAUSE admins to have a look at any that are a short edit distance from an existing package name.
Would anyone know of any prior art for detection of "short edit
distances"? (Perhaps even already on CPAN?)

Thank you very much.
Jim Keenan
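
The check proposed above (compare each newly registered package name against existing ones and flag close matches) could be sketched roughly as follows. This is a minimal illustration in Python, not an existing PAUSE tool: the function names, the threshold of 2, the case-insensitive comparison, and the short list of existing names are all assumptions made for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def near_matches(new_name, existing_names, max_distance=2):
    """Return (existing_name, distance) pairs close to new_name, nearest first.

    Hypothetical helper: names are compared case-insensitively, which is an
    assumption of this sketch rather than anything PAUSE is known to do.
    """
    hits = []
    for name in existing_names:
        if name == new_name:
            continue
        d = levenshtein(new_name.lower(), name.lower())
        if d <= max_distance:
            hits.append((name, d))
    return sorted(hits, key=lambda pair: (pair[1], pair[0]))


# Illustrative data only: names taken from this thread.
existing = ["DBIx::Class", "Class::DBI", "Algorithm::SVM", "AI::PSO"]
for candidate in ["DBI::Class", "Algorithm::VSM", "AI::POS"]:
    print(candidate, near_matches(candidate, existing))
```

A check like this catches the DBI::Class / DBIx::Class typo, but not the "misremembered" Class::DBI case: swapping whole name components is a large edit distance, so that would need a separate comparison (for example, on sorted components).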
Zefram
2017-09-20 22:14:20 UTC
Would anyone know of any prior art for detection of "short edit distances"?
(Perhaps even already on CPAN?)
Text::Levenshtein.

-zefram
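
One caveat for the confusable pairs mentioned earlier in the thread: plain Levenshtein distance counts a swap of two adjacent letters as two edits, so Algorithm::SVM / Algorithm::VSM and AI::POS / AI::PSO come out at distance 2, not 1. The sketch below, in Python and not using Text::Levenshtein's API, shows the "optimal string alignment" variant, which treats an adjacent transposition as a single edit. The function name osa_distance is hypothetical.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance.

    Like Levenshtein distance, but swapping two adjacent characters
    costs 1 instead of 2.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            # Adjacent transposition, e.g. "SV" <-> "VS"
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


for pair in [("Algorithm::SVM", "Algorithm::VSM"),
             ("AI::POS", "AI::PSO"),
             ("DBI::Class", "DBIx::Class")]:
    print(pair, osa_distance(*pair))
```

With this variant all three pairs are at distance 1, so a threshold of 1 would flag them without also flagging every name that differs by two unrelated letters.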
David Precious
2017-09-20 22:13:50 UTC
On Wed, 20 Sep 2017 18:08:34 -0400
Post by James E Keenan
One thing we could do is have a tool looking at newly registered
package names and alert the PAUSE admins to have a look at any that
are a short edit distance from an existing package name.
Would anyone know of any prior art for detection of "short edit
distances"? (Perhaps even already on CPAN?)
Isn't that just the Levenshtein distance? So e.g.
Neil's Text::Levenshtein?

One thing I think is good to consider is the fact that all CPAN releases
get announced on a quite populated IRC channel, increasing the chance of
someone spotting a release announcement and thinking "hmm, that looks
dodgy" - but that's of course not entirely reliable, and doesn't focus
only on new releases.
David Cantrell
2017-09-21 12:11:11 UTC
Post by David Precious
One thing I think is good to consider is the fact that all CPAN releases
get announced on a quite populated IRC channel, increasing the chance of
someone spotting a release announcement and thinking "hmm, that looks
dodgy" - but that's of course not entirely reliable, and doesn't focus
only on new releases.
But is anyone paying attention? I assume you're talking about
#cpantesters, which I'm on, but I hardly ever look at it, and when I do
look I certainly don't look at scrollback, let alone looking at
scrollback *carefully*.
--
David Cantrell | Godless Liberal Elitist

Planckton: n, the smallest possible living thing