OpenAI
curl https://api.openai.com/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "text-davinci-003",
    "prompt": "The following is a conversation with an AI assistant. The assistant is helpful, creative, clever, and very friendly.\n\nHuman: Hello, who are you?\nAI: I am an AI created by OpenAI. How can I help you today?\nHuman: I'd like to cancel my subscription.\nAI:",
    "temperature": 0.9,
    "max_tokens": 150,
    "top_p": 1,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.6,
    "stop": [" Human:", " AI:"]
  }'
Splitting the request body into its own file, we create peticion.json:
{
  "model": "text-davinci-003",
  "prompt": "The following is a conversation with an AI assistant. The assistant is helpful, creative, clever, and very friendly.\n\nHuman: Hello, who are you?\nAI: I am an AI created by OpenAI. How can I help you today?\nHuman: I'd like to cancel my subscription.\nAI:",
  "temperature": 0.9,
  "max_tokens": 150,
  "top_p": 1,
  "frequency_penalty": 0.0,
  "presence_penalty": 0.6,
  "stop": [" Human:", " AI:"]
}
And we send the request:
curl -X POST https://api.openai.com/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d @peticion.json
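The same request can be sent from Python. A minimal sketch, assuming the legacy openai Python library (0.x, the one current at the time of writing), the peticion.json file from above, and the OPENAI_API_KEY environment variable:

import json
import openai  # legacy 0.x client (pip install openai); reads OPENAI_API_KEY from the environment

# The keys in peticion.json match the parameters of the completions endpoint,
# so the file can be unpacked straight into the call.
with open("peticion.json") as f:
    params = json.load(f)

response = openai.Completion.create(**params)
print(response["choices"][0]["text"])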
Training a model
https://platform.openai.com/docs/guides/fine-tuning
You have to generate a JSONL file with question/answer pairs, which are called prompt and completion.
We install openai-cli. We create a CSV file with the following format:
prompt,completion
"nombre de la víctima","Pedro"
"posible sospechoso","Roberto"
"pista encontrada","una llave en el suelo al lado de la puerta de la entrada"
"pista encontrada","unos zapatos que no pertenecen a la víctima en la habitación de la víctima"
"pista encontrada","un mensaje de texto en el móvil de la víctima"
"novia de Pedro","Sandra en los años 2016,2017"
"novia de Pedro","Laura en el año 2022"
"novia de Roberto","Laura en los años 2018,2019,2020,2021"
We convert it to JSONL format. The tool suggests adding some separators; we answer yes to everything. We run this command:
openai tools fine_tunes.prepare_data -f fichero.csv
It creates the following file:
{"prompt":"nombre de la víctima ->","completion":" Pedro\n"}
{"prompt":"posible sospechoso ->","completion":" Roberto\n"}
{"prompt":"pista encontrada ->","completion":" una llave en el suelo al lado de la puerta de la entrada\n"}
{"prompt":"pista encontrada ->","completion":" unos zapatos que no pertenecen a la víctima en la habitación de la víctima\n"}
{"prompt":"pista encontrada ->","completion":" un mensaje de texto en el móvil de la víctima\n"}
{"prompt":"novia de Pedro ->","completion":" Sandra en los años 2016,2017\n"}
{"prompt":"novia de Pedro ->","completion":" Laura en el año 2022\n"}
{"prompt":"novia de Roberto ->","completion":" Laura en los años 2018,2019,2020,2021\n"}
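The same JSONL can also be written directly from Python, skipping the CSV; a small illustrative sketch (this helper is not part of the OpenAI tooling):

import json

# (prompt, completion) pairs; only the first two from the example are shown here.
pairs = [
    ("nombre de la víctima", "Pedro"),
    ("posible sospechoso", "Roberto"),
]

with open("entrenando_01_prepared.jsonl", "w", encoding="utf-8") as f:
    for prompt, completion in pairs:
        # Same conventions prepare_data applies: a " ->" separator at the end of the
        # prompt, and a leading space plus trailing "\n" in the completion.
        record = {"prompt": f"{prompt} ->", "completion": f" {completion}\n"}
        f.write(json.dumps(record, ensure_ascii=False) + "\n")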
We upload it to OpenAI and train on the ada model (the cheap one). The available base models are: ada, babbage, curie, davinci.
openai api fine_tunes.create -t entrenando_01_prepared.jsonl -m ada
Upload progress: 100%|████████████████████████████████████████████| 699/699 [00:00<00:00, 387kit/s]
Uploaded file from entrenando_01_prepared.jsonl: file-MNz4rv9kV8jpbiRveTXA76YG
Created fine-tune: ft-V8Seq1neyFcJOncJDRUbabWD
Streaming events until fine-tuning is complete... (Ctrl-C will interrupt the stream, but not cancel the fine-tune)
[2023-02-27 19:22:47] Created fine-tune: ft-V8Seq1neyFcJOncJDRUbabWD
Stream interrupted (client disconnected).
To resume the stream, run:
openai api fine_tunes.follow -i ft-V8Seq1neyFcJOncJDRUbabWD
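The same job can be launched from Python; a rough sketch with the legacy 0.x client, using the file name and base model from the example above, and assuming OPENAI_API_KEY is in the environment:

import openai  # legacy 0.x client; reads OPENAI_API_KEY from the environment

# Upload the training file, then start the fine-tune on the ada base model.
training_file = openai.File.create(
    file=open("entrenando_01_prepared.jsonl", "rb"),
    purpose="fine-tune",
)
job = openai.FineTune.create(training_file=training_file["id"], model="ada")
print(job["id"], job["status"])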
We can check the status of the job:
openai api fine_tunes.get -i ft-V8Seq1neyFcJOncJDRUbabWD
{
  "created_at": 1677525767,
  "events": [
    { "created_at": 1677525767, "level": "info", "message": "Created fine-tune: ft-V8Seq1neyFcJOncJDRUbabWD", "object": "fine-tune-event" }
  ],
  "fine_tuned_model": null,
  "hyperparams": {
    "batch_size": null,
    "learning_rate_multiplier": null,
    "n_epochs": 4,
    "prompt_loss_weight": 0.01
  },
  "id": "ft-V8Seq1neyFcJOncJDRUbabWD",
  "model": "ada",
  "object": "fine-tune",
  "organization_id": "org-W85oba51ZpI7Keymmpa2exBj",
  "result_files": [],
  "status": "pending",
  "training_files": [
    { "bytes": 699, "created_at": 1677525767, "filename": "entrenando_01_prepared.jsonl", "id": "file-MNz4rv9kV8jpbiRveTXA76YG", "object": "file", "purpose": "fine-tune", "status": "processed", "status_details": null }
  ],
  "updated_at": 1677525767,
  "validation_files": []
}
The status is pending; when the fine-tune has completed, it looks like this:
{
  "created_at": 1677525767,
  "events": [
    { "created_at": 1677525767, "level": "info", "message": "Created fine-tune: ft-V8Seq1neyFcJOncJDRUbabWD", "object": "fine-tune-event" },
    { "created_at": 1677526162, "level": "info", "message": "Fine-tune costs $0.00", "object": "fine-tune-event" },
    { "created_at": 1677526162, "level": "info", "message": "Fine-tune enqueued. Queue number: 0", "object": "fine-tune-event" },
    { "created_at": 1677526164, "level": "info", "message": "Fine-tune started", "object": "fine-tune-event" },
    { "created_at": 1677526178, "level": "info", "message": "Completed epoch 1/4", "object": "fine-tune-event" },
    { "created_at": 1677526180, "level": "info", "message": "Completed epoch 2/4", "object": "fine-tune-event" },
    { "created_at": 1677526181, "level": "info", "message": "Completed epoch 3/4", "object": "fine-tune-event" },
    { "created_at": 1677526182, "level": "info", "message": "Completed epoch 4/4", "object": "fine-tune-event" },
    { "created_at": 1677526205, "level": "info", "message": "Uploaded model: ada:ft-iwanttobefreak-2023-02-27-19-30-05", "object": "fine-tune-event" },
    { "created_at": 1677526208, "level": "info", "message": "Uploaded result file: file-goxBKlVtpq8p0X4otjLCFh0F", "object": "fine-tune-event" },
    { "created_at": 1677526208, "level": "info", "message": "Fine-tune succeeded", "object": "fine-tune-event" }
  ],
  "fine_tuned_model": "ada:ft-iwanttobefreak-2023-02-27-19-30-05",
  "hyperparams": {
    "batch_size": 1,
    "learning_rate_multiplier": 0.1,
    "n_epochs": 4,
    "prompt_loss_weight": 0.01
  },
  "id": "ft-V8Seq1neyFcJOncJDRUbabWD",
  "model": "ada",
  "object": "fine-tune",
  "organization_id": "org-W85oba51ZpI7Keymmpa2exBj",
  "result_files": [
    { "bytes": 1545, "created_at": 1677526206, "filename": "compiled_results.csv", "id": "file-goxBKlVtpq8p0X4otjLCFh0F", "object": "file", "purpose": "fine-tune-results", "status": "processed", "status_details": null }
  ],
  "status": "succeeded",
  "training_files": [
    { "bytes": 699, "created_at": 1677525767, "filename": "entrenando_01_prepared.jsonl", "id": "file-MNz4rv9kV8jpbiRveTXA76YG", "object": "file", "purpose": "fine-tune", "status": "processed", "status_details": null }
  ],
  "updated_at": 1677526208,
  "validation_files": []
}
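Rather than re-running the get command by hand, the job can be polled from Python until it finishes; a sketch with the legacy 0.x client, using the job id from this example (the polling interval is arbitrary):

import time
import openai  # legacy 0.x client; reads OPENAI_API_KEY from the environment

job_id = "ft-V8Seq1neyFcJOncJDRUbabWD"  # id returned by fine_tunes.create

while True:
    job = openai.FineTune.retrieve(id=job_id)
    print(job["status"])
    if job["status"] in ("succeeded", "failed", "cancelled"):
        break
    time.sleep(30)

print("fine_tuned_model:", job["fine_tuned_model"])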
Now it does list our model:
openai api fine_tunes.list
{
  "data": [
    {
      "created_at": 1677525767,
      "fine_tuned_model": "ada:ft-iwanttobefreak-2023-02-27-19-30-05",
      "hyperparams": {
        "batch_size": 1,
        "learning_rate_multiplier": 0.1,
        "n_epochs": 4,
        "prompt_loss_weight": 0.01
      },
      "id": "ft-V8Seq1neyFcJOncJDRUbabWD",
      "model": "ada",
      "object": "fine-tune",
      "organization_id": "org-W85oba51ZpI7Keymmpa2exBj",
      "result_files": [
        { "bytes": 1545, "created_at": 1677526206, "filename": "compiled_results.csv", "id": "file-goxBKlVtpq8p0X4otjLCFh0F", "object": "file", "purpose": "fine-tune-results", "status": "processed", "status_details": null }
      ],
      "status": "succeeded",
      "training_files": [
        { "bytes": 699, "created_at": 1677525767, "filename": "entrenando_01_prepared.jsonl", "id": "file-MNz4rv9kV8jpbiRveTXA76YG", "object": "file", "purpose": "fine-tune", "status": "processed", "status_details": null }
      ],
      "updated_at": 1677526208,
      "validation_files": []
    }
  ],
  "object": "list"
}
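The equivalent call from Python (legacy 0.x client) is just:

import openai  # reads OPENAI_API_KEY from the environment

for job in openai.FineTune.list()["data"]:
    print(job["id"], job["status"], job["fine_tuned_model"])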
With ada the results look pretty rubbish…
curl https://api.openai.com/v1/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "¿Como se llama la víctima?", "model": "ada:ft-iwanttobefreak-2023-02-27-19-30-05"}'
{
  "id": "cmpl-6odiKxpEHTmTzmKrpAnqMbQpB77M0",
  "object": "text_completion",
  "created": 1677527080,
  "model": "ada:ft-iwanttobefreak-2023-02-27-19-30-05",
  "choices": [
    {
      "text": "\n\n—Monócrata Núrida.\n\n—",
      "index": 0,
      "logprobs": null,
      "finish_reason": "length"
    }
  ],
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 15,
    "total_tokens": 28
  }
}
I asked the question again and it answered:
Toni Toni señaló la puerta
With davinci it takes longer, but it still gives answers that have nothing to do with the question.
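For reference, the fine-tuned model can also be queried from Python (legacy 0.x client; max_tokens and stop are assumptions to keep the answer short). Note that the training prompts end in " ->", so using that same format in the query usually works better than a free-form question:

import openai  # reads OPENAI_API_KEY from the environment

response = openai.Completion.create(
    model="ada:ft-iwanttobefreak-2023-02-27-19-30-05",
    prompt="nombre de la víctima ->",  # same format as the training prompts
    max_tokens=20,
    stop=["\n"],  # the completions in the training file end with "\n"
)
print(response["choices"][0]["text"])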
A friend of Jorge's
It takes texts from Game of Thrones and you can ask it questions. It is in English.
https://huggingface.co/deepset/roberta-base-squad2
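A quick way to try this kind of model is the Hugging Face transformers question-answering pipeline; a minimal sketch (the context text below is made up for illustration, you would pass in the Game of Thrones passages instead):

from transformers import pipeline

# Extractive QA: the model picks the answer span out of the context you give it.
qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

context = "Jon Snow is the son of Lyanna Stark and Rhaegar Targaryen."
result = qa(question="Who are Jon Snow's parents?", context=context)
print(result["answer"], result["score"])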
Miscellaneous
NLP: natural language processing services.
Several Amazon Machine Learning services:
https://aws.amazon.com/es/free/machine-learning/
New API
In March 2023 the new gpt-3.5-turbo model was released, along with its API.
Sources:
https://platform.openai.com/docs/guides/chat/introduction
https://github.com/openai/openai-cookbook/blob/main/examples/How_to_format_inputs_to_ChatGPT_models.ipynb
We export the key in bash:
export API_KEY='sk-aslasdjkasldjasldjasldjkasldkj'
From Python:
import os
import openai

# The client needs the key; read it from the API_KEY variable exported above.
openai.api_key = os.environ["API_KEY"]

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)
print(response)
Response:
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "The 2020 World Series was played at a neutral site due to the COVID-19 pandemic. The games were played at Globe Life Field in Arlington, Texas.",
        "role": "assistant"
      }
    }
  ],
  "created": 1677797684,
  "id": "chatcmpl-6pm6uk0cGDiLdRTq8GB8vy90AWzbb",
  "model": "gpt-3.5-turbo-0301",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 35,
    "prompt_tokens": 56,
    "total_tokens": 91
  }
}
With curl:
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Inventate una historia de un asesinato en Carabanchel"}]
  }'
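The chat endpoint is stateless, so to continue the conversation you append the assistant's reply to the messages list and send the next user turn; a sketch with the same legacy 0.x client (the follow-up question is made up):

import os
import openai

openai.api_key = os.environ["API_KEY"]  # the variable exported above

messages = [{"role": "user", "content": "Inventate una historia de un asesinato en Carabanchel"}]

first = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
messages.append(first["choices"][0]["message"])  # keep the assistant's reply in the history

messages.append({"role": "user", "content": "¿Quién era el asesino?"})  # hypothetical follow-up
second = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
print(second["choices"][0]["message"]["content"])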