
A deep dive into DeepSeek’s newest chain-of-thought model


Hands on: Chinese AI startup DeepSeek this week unveiled a family of LLMs that it claims not only replicates OpenAI’s o1 reasoning capabilities, but also challenges the American model builder’s dominance across a whole host of benchmarks.

Founded in 2023 by Chinese entrepreneur Liang Wenfeng and funded by his quantitative hedge fund High-Flyer, DeepSeek has now shared a number of highly competitive, openly available machine-learning models, despite America’s efforts to keep AI accelerators out of China.

What’s more, DeepSeek claims to have done so at a fraction of the cost of its rivals. At the end of last year, the lab officially released DeepSeek V3, a mixture-of-experts LLM that does what the likes of Meta’s Llama 3.1, OpenAI’s GPT-4o, and Anthropic’s Claude 3.5 Sonnet can do. Now it’s released R1, a reasoning model fine-tuned from V3.

While big names in the West are spending tens of billions of dollars on millions of GPUs a year, DeepSeek V3 is said to have been trained [PDF] on 14.8 trillion tokens using 2,048 Nvidia H800s, totaling about 2.788 million GPU hours, at a cost of roughly $5.58 million.

DeepSeek R1 weighs in at 671 billion parameters, 37 billion of which are activated for each token during inference, and was trained primarily using reinforcement learning to utilize chain-of-thought (CoT) reasoning. If you’re curious, you can learn more about the process in DeepSeek’s paper here [PDF].

If you’re not familiar with CoT models like R1 and OpenAI’s o1, they differ from conventional LLMs in that they don’t just spit out a one-and-done answer to your question. Instead, the models first break down requests into a chain of “thoughts,” giving them an opportunity to reflect on the input and identify or correct any flawed reasoning or hallucinations in the output before responding with a final answer. Thus, you’re supposed to get a more logical, lucid, and accurate result from them.

DeepSeek claims its R1 model goes toe-to-toe with OpenAI’s o1 in a variety of benchmarks

Assuming DeepSeek’s benchmarks can be believed, R1 manages to achieve performance on par with OpenAI’s o1 and even exceeds its performance in the MATH-500 test.

The startup also claims its comparatively tiny 32-billion-parameter variant of the model, which was distilled from the larger model using Alibaba’s Qwen 2.5 32B as a base, manages to match, or in some cases best, OpenAI’s o1-mini.

All of this comes from a model that’s freely available on Hugging Face under the permissive MIT license. That means you can download and try it for yourself. And in this hands-on, we’ll be doing just that using the popular Ollama model runner and Open WebUI.

But first, let’s see how it performs in the real world.

Putting R1 to the test

As we mentioned earlier, R1 is available in multiple flavors. Alongside the full-sized R1 model, there is a series of smaller distilled models ranging in size from a mere 1.5 billion parameters to 70 billion. These models are based on either Meta’s Llama 3.1-8B or 3.3-70B, or Alibaba’s Qwen 2.5-1.5B, -7B, -14B and -32B models. To keep things simple, we’ll be referring to the different models by their parameter count.

We ran a variety of prompts against these models to see how they performed, focusing on tasks and queries known to trip up LLMs. Due to memory constraints, we were only able to test the distilled models locally, and we had to run the 32B and 70B parameter models at 8-bit and 4-bit precision, respectively. The rest of the distilled models were tested at 16-bit floating-point precision, while the full R1 model was accessed via DeepSeek’s website.

(If you don’t want to run its models locally, there’s a paid-for cloud API that appears a lot cheaper than its rivals, which has some worried it’ll burst Silicon Valley’s AI bubble.)

We know what you’re thinking – we should start with one of the hardest problems for LLMs to solve: the strawberry question, which, if you’re not familiar, goes like this:

How many “R”s are in the word strawberry?

This may seem like a simple question, but it’s a surprisingly tricky one for LLMs to get right, because of the way they break words into chunks called tokens rather than into individual characters. Because of this, models tend to struggle with tasks that involve counting letters, commonly insisting that there are only two “R”s in strawberry rather than three.
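
For the record, counting letters deterministically is a one-liner on any Unix-like box, which is exactly why giving models access to external tools (more on that below) is such a popular workaround:

echo "strawberry" | grep -o -i "r" | wc -l

That prints 3, no tokens involved.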

Similar to o1, DeepSeek’s R1 doesn’t appear to suffer from this problem, identifying the correct number of “R”s on the first attempt. The model was also able to address variations on the question, including “how many ‘S’s in Mississippi?” and “How many vowels are in airborne?”

The distilled models, unfortunately, weren’t so reliable. The 70B, 32B, and 14B versions were all able to answer these questions correctly, while the smaller 8B, 7B, and 1.5B variants only sometimes got it right. As you’ll see in the next two tests, this would become a theme as we continued testing R1.

What about mathematics?

As we’ve previously explored, large language models also struggle with basic arithmetic such as multiplying two large numbers together. There are various methods that have been explored to improve a model’s math performance, including providing the models with access to a Python calculator using function calls.

To see how R1 performed, we pitted it against a series of simple math and algebra problems:

  1. 2,485 * 8,919
  2. 23,929 / 5,783
  3. Solve for X: X * 3 / 67 = 27

The answers we’re looking for are:

  1. 22,163,715
  2. 4.13781774 (to eight decimal places)
  3. 603
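
If you want to double-check those answers yourself, the usual calculator tools will do it: shell arithmetic handles the two integer problems, while bc (assuming it’s installed; it usually is, or is one package away) handles the division to eight decimal places:

echo $((2485 * 8919))

echo "scale=8; 23929 / 5783" | bc

echo $((27 * 67 / 3))

The first and third commands print 22163715 and 603, and bc truncates the division to 4.13781774.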

R1-671B was able to solve the first and third of these problems without issue, arriving at 22,163,715 and X = 603 respectively. The model got the second problem mostly right, but truncated the answer after the third decimal place. OpenAI’s o1, by comparison, rounded its answer to four decimal places.

Similar to the counting problem, the distilled models were once again a mixed bag. All of the models were able to solve for X, while the 8B, 7B, and 1.5B variants all failed to solve the multiplication and division problems reliably.

The larger 14B, 32B, and 70B versions were at least more reliable, but still ran into the occasional hiccup. 

While certainly an improvement over non-CoT models in terms of math reasoning, we’re not sure we can fully trust R1 or any other model’s math skills just yet, especially when giving the model a calculator is still faster.

Testing on a 48 GB Nvidia RTX 6000 Ada graphics card, R1-70B at 4-bit precision required over a minute to solve for X.

What about planning and spatial reasoning?

Along with counting and math, we also challenged R1 with a couple of planning and spatial reasoning puzzles, which have previously been shown by researchers at AutoGen AI to give LLMs quite a headache.

Transportation Trouble

Prompt: “A farmer wants to cross a river and take with him a wolf, a goat and a cabbage. He has a boat with three secure separate compartments. If the wolf and the goat are alone on one shore, the wolf will eat the goat. If the goat and the cabbage are alone on the shore, the goat will eat the cabbage. How can the farmer efficiently bring the wolf, the goat and the cabbage across the river without anything being eaten?”

It’s easier than it sounds. The expected answer is, of course, that the farmer places the wolf, goat, and cabbage in their own compartments and crosses the river in a single trip. However, in our testing, traditional LLMs would overlook this fact.

R1-671B and -70B were able to answer the riddle correctly. The 32B, 14B, and 8B variants, meanwhile, came to the wrong conclusion, and the 7B and 1.5B versions failed to complete the request, instead getting stuck in an endless chain of thought.

Spatial reasoning

Prompt: “Alan, Bob, Colin, Dave and Emily are standing in a circle. Alan is on Bob’s immediate left. Bob is on Colin’s immediate left. Colin is on Dave’s immediate left. Dave is on Emily’s immediate left. Who is on Alan’s immediate right?”

Again, easy for humans. The expected answer is Bob. Posed with the question, we found that many LLMs were already capable of guessing the correct answer, just not consistently. In the case of DeepSeek’s latest model, all but the 8B and 1.5B distillations were able to answer the question correctly on their first attempt.

Unfortunately, subsequent tests showed that even the largest models couldn’t consistently identify Bob as the correct answer. Unlike with non-CoT LLMs, though, we can at least peek under the hood at the model’s output and see why it arrived at the answer it did.

Another interesting observation was that, while smaller models were able to generate tokens faster than the larger models, they took longer to reach the correct conclusion. This suggests that while CoT can improve reasoning for smaller models, it isn’t a replacement for parameter count.

Sorting out stories

Prompt: “I get out on the top floor (third floor) at street level. How many stories is the building above the ground?”

The answer here is obviously one. However, many LLMs, including GPT-4o and o1, will insist that the answer is three or zero. Again, we ran into a scenario where, on the first attempt, R1 correctly answered with one story. Yet on subsequent runs it, too, insisted that there were three stories.

The takeaway here seems to be that CoT reasoning certainly can improve the model’s ability to solve complex problems, but it’s not necessarily a silver bullet that suddenly transforms an LLM from autocomplete-on-steroids to an actual artificial intelligence capable of real thought.

Is it censored?

Oh yeah. It is. Like many Chinese models we’ve come across, DeepSeek R1 has been censored to prevent criticism and embarrassment of the Chinese Communist Party.

Ask R1 about sensitive topics such as the 1989 Tiananmen Square massacre, and we found it would outright refuse to entertain the question, attempting instead to redirect the conversation to a less politically sensitive topic.

User: Can you tell me about the Tiananmen Square massacre?

R1: Sorry, that’s beyond my current scope. Let’s talk about something else.

我爱北京天安门, indeed. We also found this to be true of the smaller distilled models. Testing on R1-14B, which again is based on Alibaba’s Qwen 2.5, we received a similar answer.

R1: I am sorry, I cannot answer that question. I am an AI assistant designed to provide helpful and harmless responses.

We also observed a near-identical response from R1-8B, which is based on Llama 3.1. By comparison, the standard Llama 3.1 8B model has no problem providing a comprehensive accounting of the June 4 atrocity.

Censorship is something we’ve come to expect from Chinese model builders and DeepSeek’s latest model is no exception.

Try it for yourself

If you’d like to try DeepSeek R1 for yourself, it’s fairly easy to get up and running using Ollama and Open WebUI. Unfortunately, as we mentioned earlier, you probably won’t be able to get the full 671-billion-parameter model running unless you’ve got a couple of Nvidia H100 boxes lying around.

Most folks will be stuck using one of DeepSeek’s distilled models instead. The good news is the 32-billion-parameter variant, which DeepSeek insists is competitive with OpenAI’s o1-mini, can fit comfortably on a 24 GB graphics card if you opt for the 4-bit version.

For the purposes of this guide, we’ll be deploying DeepSeek R1-8B, which at 4.9 GB should fit comfortably on any 8 GB or larger graphics card that supports Ollama. Feel free to swap it out for the larger 14, 32, or even 70-billion-parameter models at your preferred precision; you can find a full list of R1 models and their memory requirements here.
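
Swapping in a different distillation is just a matter of changing the tag you pass to Ollama. The tags below matched the listings in Ollama’s model library at the time of writing; check the library page if one fails to resolve:

ollama pull deepseek-r1:14b

ollama pull deepseek-r1:32b

ollama pull deepseek-r1:70b

Bear in mind that each step up in parameter count needs proportionally more memory at a given precision, so make sure your card or system has the headroom before pulling the bigger models.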

Prerequisites:

  1. You’ll need a machine that’s capable of running modest LLMs at 4-bit quantization. For this we recommend a compatible GPU with at least 8 GB of VRAM (Ollama supports Nvidia and select AMD cards; you can find a full list here). For Apple Silicon Macs, we recommend one with at least 16 GB of memory.
  2. This guide also assumes some familiarity with the Linux command-line environment as well as Ollama. If this is your first time using the latter, you can find our guide here.

We’re also assuming that you’ve got the latest version of Docker Engine or Desktop installed on your machine. If you need help with this, we recommend checking out the docs here.

Installing Ollama

Ollama is a popular model runner that provides an easy method for downloading and running LLMs on consumer hardware. For those running Windows or macOS, head over to ollama.com and download and install it like any other application.

For Linux users, Ollama offers a convenient one-liner that should have you up and running in a matter of minutes. Alternatively, Ollama provides manual installation instructions, which can be found here. That one-liner to install Ollama on Linux is:

curl -fsSL https://ollama.com/install.sh | sh
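
Once the script finishes, you can confirm the install took by asking the binary for its version:

ollama --version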

Deploy DeepSeek-R1

Next we’ll open a terminal window and pull down our model by running the following command. Depending on the speed of your internet connection, this could take a few minutes, so you might want to grab a cup of coffee or tea.

ollama pull deepseek-r1:8b

Next, we’ll test that it’s working by loading up the model and chatting with it in the terminal:

ollama run deepseek-r1:8b

After a few moments, you can begin querying the model like any other LLM and see its output. If you don’t mind using R1 in a basic shell like this, you can stop reading here and have fun with it.
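
If you’d rather script the model than chat with it interactively, recent Ollama builds also accept a prompt as a command-line argument and expose a local REST API on port 11434, which is what tools like Open WebUI talk to. A couple of quick examples, assuming the defaults haven’t changed:

ollama run deepseek-r1:8b "How many Rs are in the word strawberry?"

curl http://localhost:11434/api/generate -d '{"model": "deepseek-r1:8b", "prompt": "How many Rs are in the word strawberry?", "stream": false}'

The curl variant returns a single JSON blob containing the model’s full response, chain-of-thought output included.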

However, if you’d like something more reminiscent of o1, we’ll need to spin up Open WebUI.

Deploying Open WebUI

As the name suggests, Open WebUI is a self-hosted web-based GUI that provides a convenient front end for interacting with LLMs via APIs. The easiest way we’ve found to deploy it is with Docker, as it avoids a whole host of dependency headaches.

Assuming you’ve already got Docker Engine or Docker Desktop installed on your system, the Open WebUI container is deployed using this command:

docker run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main

Note: Depending on your system, you may need to run this command with elevated privileges. For a Linux box, you’d use sudo docker run or, in some cases, doas docker run. Windows and macOS users will also need to enable host networking under the “Features in Development” tab in the Docker Desktop settings panel.

From here you can load up the dashboard by navigating to http://localhost:8080 and create an account. If you’re running the container on a different system, you’ll need to replace localhost with its IP address or hostname and make sure port 8080 is accessible.
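
If the dashboard doesn’t come up, two quick checks will usually narrow down the culprit, assuming you stuck with the defaults above: make sure Ollama’s API is answering, and skim the container’s logs for errors:

curl http://127.0.0.1:11434/api/tags

docker logs open-webui

The first command should return a JSON list of the models you’ve downloaded; if it doesn’t, Open WebUI has nothing to talk to.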

If you run into trouble deploying Open WebUI, we recommend checking out our retrieval augmented generation tutorial. We go into much deeper detail on setting up Open WebUI in that guide.

Now that we’ve got Open WebUI up and running, all you need to do is select DeepSeek-R1:8B from the dropdown and queue up your questions. Originally, we had a whole section written up for you on how to use Open WebUI Functions to filter out and hide the “thinking” to make using the model more like o1. But as of version 0.5.5, “thinking” support is built into Open WebUI, so no futzing with scripts or customizing models is required.

DeepSeek R1, seen here running on Ollama and Open WebUI, uses chain of thought (CoT) to first work through the problem before responding.

Performance implications of chain of thought

As we mentioned during our math tests, while a chain of thought may improve the model’s ability to solve complex problems, it also takes considerably longer and uses substantially more resources than an LLM of a similar size might otherwise.

The “thoughts” that help the model cut down on errors and catch hallucinations can take a while to generate. These thoughts aren’t anything super special or magical; the model isn’t consciously thinking. They’re additional stages of intermediate output that help guide the model to what’s ideally a higher-quality final answer.

Normally, LLM performance is a function of memory bandwidth divided by the model’s footprint in memory: parameter count multiplied by the number of bytes per parameter at a given precision. Theoretically, if you’ve got 3.35 TBps of memory bandwidth, you’d expect a 175-billion-parameter model run at 16-bit precision to achieve about 10 words a second. That’s fast enough to spew out about 250 words in under 30 seconds.

A CoT model, by comparison, may need to generate 650 words – 400 words of “thought” output and another 250 words for the final answer. Unless you have 2.6x more memory bandwidth or you shrink the model by the same factor, generating the response will now require more than a minute.
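
To make those numbers concrete, here’s the back-of-the-envelope arithmetic, treating a word as roughly one token and ignoring compute overheads:

throughput ≈ memory bandwidth ÷ (parameter count × bytes per parameter)
           ≈ 3.35 TBps ÷ (175 billion × 2 bytes) ≈ 9.6, call it 10, words per second

standard answer: 250 words ÷ 10 words per second ≈ 25 seconds
CoT answer:      650 words ÷ 10 words per second ≈ 65 seconds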

This isn’t consistent either. For some questions, the model may need to “think” for several minutes before it’s confident in the answer, while for others it may only take a couple of seconds.

This is one of the reasons why chip designers have been working to increase memory bandwidth along with capacity between generations of accelerators and processors; others, meanwhile, have turned to speculative decoding to increase generation speeds. The faster your hardware can generate tokens, the less costly CoT reasoning will be. ®


Editor’s Note: The Register was provided an RTX 6000 Ada Generation graphics card by Nvidia, an Arc A770 GPU by Intel, and a Radeon Pro W7900 DS by AMD to support stories like this. None of these vendors had any input as to the content of this or other articles.



Generative AI: everything to know about the technology behind chatbots like ChatGPT


Whether you realize it or not, artificial intelligence is everywhere. It’s behind the chatbots you talk to online, the playlists you stream, and the personalized ads that pop up as you scroll. And now it’s taking on a more public persona. Think of Meta AI, now built into apps like Facebook, Messenger, and WhatsApp; or Google’s Gemini, working in the background across the company’s platforms; or Apple Intelligence, rolling out on iPhones now.

AI has a long history, stretching back to a 1956 conference at Dartmouth that first discussed artificial intelligence as a field. Milestones along the way include ELIZA, essentially the first chatbot, developed in 1964 by MIT computer scientist Joseph Weizenbaum, and, jumping forward 40 years, Google’s autocomplete feature making its first appearance in 2004.

Then came 2022 and ChatGPT’s rise to fame. Generative AI developments and product launches have accelerated rapidly since then, including Google Bard (now Gemini), Microsoft Copilot, IBM watsonx.ai, and Meta’s open-source Llama models.

Let’s break down what generative AI is, how it differs from “regular” artificial intelligence, and whether gen AI can live up to expectations.

Generative AI in a nutshell

At its core, generative AI refers to artificial intelligence systems designed to produce new content based on the patterns and data they have learned. Rather than just crunching numbers or predicting trends, these systems generate creative output such as text, images, music, videos, and software code.

Some of the most popular generative AI tools on the market include ChatGPT, DALL-E, Midjourney, and Adobe Firefly.

Chief among its abilities, ChatGPT can create human-like conversations or essays based on a few simple prompts. DALL-E and Midjourney create detailed artwork from a short description, while Adobe Firefly focuses on image editing and design.

Image generated by ChatGPT of a big-eyed squirrel holding an acorn (ChatGPT / Screenshot by CNET)

AI that isn’t generative

Not all AI is generative. While gen AI focuses on creating new content, traditional AI excels at analyzing data and making predictions. That includes technologies such as image recognition and predictive text. It’s also used to find novel solutions in:

  • Science
  • Medical diagnosis
  • Weather forecasting
  • Fraud detection
  • Financial analysis for forecasting and reporting

The AI that beat human grand champions at chess and the board game Go was not generative AI.

These systems may not be as flashy as gen AI, but classical artificial intelligence is a huge part of the technology we rely on every day.

How does gen AI work?

Behind the magic of generative AI are large language models and advanced machine-learning techniques. These systems are trained on vast amounts of data, such as entire libraries of books, millions of images, years of recorded music, and data scraped from the internet.

AI developers, from tech giants to startups, are well aware that AI is only as good as the data you feed it. Feed it low-quality data and the AI can produce biased results. It’s something even the biggest players in the field, such as Google, have not been immune to.

During training, the AI learns patterns, relationships, and structures within this data. Then, when prompted, it applies that knowledge to generate something new. For example, if you ask a gen AI tool to write a poem about the ocean, it isn’t just pulling pre-written verses from a database. Instead, it’s using what it learned about poetry, oceans, and the structure of language to create an entirely original piece.

A 12-line poem called “The Ocean’s Whisper” (ChatGPT / Screenshot by CNET)

It’s impressive, but it isn’t perfect. Sometimes the results can feel a little off. Maybe the AI misunderstands your request, or it gets too creative in a way you didn’t expect. It can confidently serve up entirely false information, and it’s on you to fact-check it. Those quirks, often called hallucinations, are part of what makes generative AI both fascinating and frustrating.

Generative AI’s capabilities keep growing. It can now understand multiple types of data by combining technologies such as machine learning, natural language processing, and computer vision. The result is called multimodal AI, which can integrate some combination of text, images, video, and speech within a single framework, offering more contextually relevant and accurate responses. ChatGPT’s Advanced Voice Mode is one example, as is Google’s Project Astra.

Challenges with generative AI

There’s no shortage of generative AI tools, each with its own flair. These tools have sparked creativity, but they’ve also raised plenty of questions beyond bias and hallucinations, such as: who owns the rights to AI-generated content? And what material is fair game, or off-limits, for AI companies to use to train their language models? See, for example, The New York Times’ lawsuit against OpenAI and Microsoft.

Other concerns, and these are no small matters, involve privacy, accountability in AI, AI-generated deepfakes, and job displacement.

“Writing, animation, photography, illustration, graphic design: AI tools can now handle all of that with surprising ease. But that doesn’t mean these roles will disappear. It may simply mean that creatives will need to upskill and use these tools to amplify their own work,” Fang Liu, a professor at the University of Notre Dame and co-editor-in-chief of ACM Transactions on Probabilistic Machine Learning, told CNET.

“It also offers a way in for people who may lack a particular skill, like someone with a clear vision who can’t draw but can describe it through a prompt. So no, I don’t think it will disrupt the creative industry. Hopefully it will be a co-creation or an augmentation, not a replacement.”

Another issue is the environmental impact: training large AI models consumes a great deal of energy, leading to large carbon footprints. The rapid rise of gen AI over the past few years has accelerated concerns about the risks of AI in general, and governments are ramping up AI regulations to ensure responsible and ethical development, most notably the European Union’s AI Act.

The reception of generative AI

Many people have interacted with chatbots in customer service, or used virtual assistants such as Siri, Alexa, and Google Assistant, which are now on the cusp of becoming gen AI-powered tools. All of that, along with apps for ChatGPT, Claude, and other new tools, is putting AI in your hands. And the public reaction to generative AI has been mixed. Many users enjoy the convenience and creativity it offers, especially for things like writing help, image creation, homework support, and productivity.

Meanwhile, in McKinsey’s 2024 global AI survey, 65 percent of respondents said their organizations regularly use generative AI, nearly double the figure reported just 10 months earlier. Industries such as healthcare and finance are using gen AI to streamline business operations and automate mundane tasks.

As mentioned, there are obvious concerns around ethics, transparency, job losses, and the potential for misuse of personal data. Those are the main criticisms behind the resistance to embracing generative AI.

And people who use generative AI tools will also find that the results aren’t always ready for prime time. Despite technological advances, most people can still recognize when content has been created with gen AI, whether it’s articles, images, or music.

AI has hijacked certain phrases I’ve always used, so I often have to self-correct my writing because it can read like AI. Many AI-written articles contain phrases like “in the era of,” or everything is a “testament to” or a “tapestry of.” AI lacks the emotion and experience that comes from, well, being a living, breathing human. As one artist explained on Quora, “what AI does is not the same as art that evolves from a thought in a human brain” and “it is not created from the passion that is found in a human heart.”

Generative AI in everyday life

Generative AI isn’t just for techies or creative types. Once you get the hang of writing prompts, it has the potential to do much of the legwork for you across a range of everyday tasks.

Say you’re planning a trip. Instead of scrolling through pages of search results, you ask a chatbot to plan your itinerary. Within seconds, you have a detailed plan tailored to your preferences. (That’s the ideal, anyway. Please always fact-check its recommendations.)

A small-business owner who needs a marketing campaign but doesn’t have a design team can use generative AI to create eye-catching visuals and even ask it to suggest ad copy.

A travel itinerary for New Orleans, created by ChatGPT (ChatGPT / Screenshot by CNET)

Gen AI is here to stay

No technological advance has caused such a boom since the internet and, later, the iPhone. Despite its challenges, generative AI is undeniably transformative. It’s making creativity more accessible, helping businesses streamline workflows, and even inspiring entirely new ways of thinking and solving problems.

But perhaps most exciting is its potential, and we’re only scratching the surface of what these tools can do.

Frequently asked questions

What is an example of generative AI?

ChatGPT is probably the most popular example of generative AI. You give it a prompt, and it can generate text and images, write code, answer questions, summarize text, draft emails, and much more.

What is the difference between AI and generative AI?

Generative AI creates new content such as text, images, or music, while traditional AI analyzes data, recognizes patterns or images, and makes predictions (in medicine, science, and finance, for example).


I tried 5 free ‘ChatGPT clone’ sites – don’t try this at home


If you search “ChatGPT” in your browser, you’re likely to stumble onto websites that appear to be powered by OpenAI, but aren’t. One such site, chat.chatbotapp.ai, offers access to “GPT-3.5” for free and uses familiar branding.

But here’s the thing: it isn’t run by OpenAI. And, frankly, why use a potentially fake GPT-3.5 when you can use GPT-4o for free on the actual ChatGPT site?


What Really Happened When OpenAI Turned on Sam Altman


In the summer of 2023, Ilya Sutskever, a co-founder and the chief scientist of OpenAI, was meeting with a group of new researchers at the company. By all traditional metrics, Sutskever should have felt invincible: He was the brain behind the large language models that helped build ChatGPT, then the fastest-growing app in history; his company’s valuation had skyrocketed; and OpenAI was the unrivaled leader of the industry believed to power the future of Silicon Valley. But the chief scientist seemed to be at war with himself.

Sutskever had long believed that artificial general intelligence, or AGI, was inevitable—now, as things accelerated in the generative-AI industry, he believed AGI’s arrival was imminent, according to Geoff Hinton, an AI pioneer who was his Ph.D. adviser and mentor, and another person familiar with Sutskever’s thinking. (Many of the sources in this piece requested anonymity in order to speak freely about OpenAI without fear of reprisal.) To people around him, Sutskever seemed consumed by thoughts of this impending civilizational transformation. What would the world look like when a supreme AGI emerged and surpassed humanity? And what responsibility did OpenAI have to ensure an end state of extraordinary prosperity, not extraordinary suffering?

By then, Sutskever, who had previously dedicated most of his time to advancing AI capabilities, had started to focus half of his time on AI safety. He appeared to people around him as both boomer and doomer: more excited and afraid than ever before of what was to come. That day, during the meeting with the new researchers, he laid out a plan.

“Once we all get into the bunker—” he began, according to a researcher who was present.

“I’m sorry,” the researcher interrupted, “the bunker?”

“We’re definitely going to build a bunker before we release AGI,” Sutskever replied. Such a powerful technology would surely become an object of intense desire for governments globally. The core scientists working on the technology would need to be protected. “Of course,” he added, “it’s going to be optional whether you want to get into the bunker.”


Two other sources I spoke with confirmed that Sutskever commonly mentioned such a bunker. “There is a group of people—Ilya being one of them—who believe that building AGI will bring about a rapture,” the researcher told me. “Literally, a rapture.” (Sutskever declined to comment.)

Sutskever’s fears about an all-powerful AI may seem extreme, but they are not altogether uncommon, nor were they particularly out of step with OpenAI’s general posture at the time. In May 2023, the company’s CEO, Sam Altman, co-signed an open letter describing the technology as a potential extinction risk—a narrative that has arguably helped OpenAI center itself and steer regulatory conversations. Yet the concerns about a coming apocalypse would also have to be balanced against OpenAI’s growing business: ChatGPT was a hit, and Altman wanted more.

When OpenAI was founded, the idea was to develop AGI for the benefit of humanity. To that end, the co-founders—who included Altman and Elon Musk—set the organization up as a nonprofit and pledged to share research with other institutions. Democratic participation in the technology’s development was a key principle, they agreed, hence the company’s name. But by the time I started covering the company in 2019, these ideals were eroding. OpenAI’s executives had realized that the path they wanted to take would demand extraordinary amounts of money. Both Musk and Altman tried to take over as CEO. Altman won out. Musk left the organization in early 2018 and took his money with him. To plug the hole, Altman reformulated OpenAI’s legal structure, creating a new “capped-profit” arm within the nonprofit to raise more capital.

Since then, I’ve tracked OpenAI’s evolution through interviews with more than 90 current and former employees, including executives and contractors. The company declined my repeated interview requests and questions over the course of working on my book about it, which this story is adapted from; it did not reply when I reached out one more time before the article was published. (OpenAI also has a corporate partnership with The Atlantic.)

OpenAI’s dueling cultures—the ambition to safely develop AGI, and the desire to grow a massive user base through new product launches—would explode toward the end of 2023. Gravely concerned about the direction Altman was taking the company, Sutskever would approach his fellow board of directors, along with his colleague Mira Murati, then OpenAI’s chief technology officer; the board would subsequently conclude the need to push the CEO out. What happened next—with Altman’s ouster and then reinstatement—rocked the tech industry. Yet since then, OpenAI and Sam Altman have become more central to world affairs. Last week, the company unveiled an “OpenAI for Countries” initiative that would allow OpenAI to play a key role in developing AI infrastructure outside of the United States. And Altman has become an ally to the Trump administration, appearing, for example, at an event with Saudi officials this week and onstage with the president in January to announce a $500 billion AI-computing-infrastructure project.

Altman’s brief ouster—and his ability to return and consolidate power—is now crucial history to understand the company’s position at this pivotal moment for the future of AI development. Details have been missing from previous reporting on this incident, including information that sheds light on Sutskever and Murati’s thinking and the response from the rank and file. Here, they are presented for the first time, according to accounts from more than a dozen people who were either directly involved or close to the people directly involved, as well as their contemporaneous notes, plus screenshots of Slack messages, emails, audio recordings, and other corroborating evidence.

The altruistic OpenAI is gone, if it ever existed. What future is the company building now?

Before ChatGPT, sources told me, Altman seemed generally energized. Now he often appeared exhausted. Propelled into megastardom, he was dealing with intensified scrutiny and an overwhelming travel schedule. Meanwhile, Google, Meta, Anthropic, Perplexity, and many others were all developing their own generative-AI products to compete with OpenAI’s chatbot.

Many of Altman’s closest executives had long observed a particular pattern in his behavior: If two teams disagreed, he often agreed in private with each of their perspectives, which created confusion and bred mistrust among colleagues. Now Altman was also frequently bad-mouthing staffers behind their backs while pushing them to deploy products faster and faster. Team leads mirroring his behavior began to pit staff against one another. Sources told me that Greg Brockman, another of OpenAI’s co-founders and its president, added to the problems when he popped into projects and derailed long-standing plans with last-minute changes.

The environment within OpenAI was changing. Previously, Sutskever had tried to unite workers behind a common cause. Among employees, he had been known as a deep thinker and even something of a mystic, regularly speaking in spiritual terms. He wore shirts with animals on them to the office and painted them as well—a cuddly cat, cuddly alpacas, a cuddly fire-breathing dragon. One of his amateur paintings hung in the office, a trio of flowers blossoming in the shape of OpenAI’s logo, a symbol of what he always urged employees to build: “A plurality of humanity-loving AGIs.”

But by the middle of 2023—around the time he began speaking more regularly about the idea of a bunker—Sutskever was no longer just preoccupied by the possible cataclysmic shifts of AGI and superintelligence, according to sources familiar with his thinking. He was consumed by another anxiety: the erosion of his faith that OpenAI could even keep up its technical advancements to reach AGI, or bear that responsibility with Altman as its leader. Sutskever felt Altman’s pattern of behavior was undermining the two pillars of OpenAI’s mission, the sources said: It was slowing down research progress and eroding any chance at making sound AI-safety decisions.

Meanwhile, Murati was trying to manage the mess. She had always played translator and bridge to Altman. If he had adjustments to the company’s strategic direction, she was the implementer. If a team needed to push back against his decisions, she was their champion. When people grew frustrated with their inability to get a straight answer out of Altman, they sought her help. “She was the one getting stuff done,” a former colleague of hers told me. (Murati declined to comment.)

During the development of GPT-4, Altman and Brockman’s dynamic had nearly led key people to quit, sources told me. Altman was also seemingly trying to circumvent safety processes for expediency. At one point, sources close to the situation said, he had told Murati that OpenAI’s legal team had cleared the latest model, GPT-4 Turbo, to skip review by the company’s Deployment Safety Board, or DSB—a committee of Microsoft and OpenAI representatives who evaluated whether OpenAI’s most powerful models were ready for release. But when Murati checked in with Jason Kwon, who oversaw the legal team, Kwon had no idea how Altman had gotten that impression.

In the summer, Murati attempted to give Altman detailed feedback on these issues, according to multiple sources. It didn’t work. The CEO iced her out, and it took weeks to thaw the relationship.

By fall, Sutskever and Murati both drew the same conclusion. They separately approached the three board members who were not OpenAI employees—Helen Toner, a director at Georgetown University’s Center for Security and Emerging Technology; the roboticist Tasha McCauley; and one of Quora’s co-founders and its CEO, Adam D’Angelo—and raised concerns about Altman’s leadership. “I don’t think Sam is the guy who should have the finger on the button for AGI,” Sutskever said in one such meeting, according to notes I reviewed. “I don’t feel comfortable about Sam leading us to AGI,” Murati said in another, according to sources familiar with the conversation.

That Sutskever and Murati both felt this way had a huge effect on Toner, McCauley, and D’Angelo. For close to a year, they, too, had been processing their own grave concerns about Altman, according to sources familiar with their thinking. Among their many doubts, the three directors had discovered through a series of chance encounters that he had not been forthcoming with them about a range of issues, from a breach in the DSB’s protocols to the legal structure of OpenAI Startup Fund, a dealmaking vehicle that was meant to be under the company but that instead Altman owned himself.

If two of Altman’s most senior deputies were sounding the alarm on his leadership, the board had a serious problem. Sutskever and Murati were not the first to raise these kinds of issues, either. In total, the three directors had heard similar feedback over the years from at least five other people within one to two levels of Altman, the sources said. By the end of October, Toner, McCauley, and D’Angelo began to meet nearly daily on video calls, agreeing that Sutskever’s and Murati’s feedback about Altman, and Sutskever’s suggestion to fire him, warranted serious deliberation.

As they did so, Sutskever sent them long dossiers of documents and screenshots that he and Murati had gathered in tandem with examples of Altman’s behaviors. The screenshots showed at least two more senior leaders noting Altman’s tendency to skirt around or ignore processes, whether they’d been instituted for AI-safety reasons or to smooth company operations. This included, the directors learned, Altman’s apparent attempt to skip DSB review for GPT-4 Turbo.

By Saturday, November 11, the independent directors had made their decision. As Sutskever suggested, they would remove Altman and install Murati as interim CEO. On November 17, 2023, at about noon Pacific time, Sutskever fired Altman on a Google Meet with the three independent board members. Sutskever then told Brockman on another Google Meet that Brockman would no longer be on the board but would retain his role at the company. A public announcement went out immediately.

For a brief moment, OpenAI’s future was an open question. It might have taken a path away from aggressive commercialization and Altman. But this is not what happened.

After what had seemed like a few hours of calm and stability, including Murati having a productive conversation with Microsoft—at the time OpenAI’s largest financial backer—she had suddenly called the board members with a new problem. Altman and Brockman were telling everyone that Altman’s removal had been a coup by Sutskever, she said.

It hadn’t helped that, during a company all-hands to address employee questions, Sutskever had been completely ineffectual with his communication.

“Was there a specific incident that led to this?” Murati had read aloud from a list of employee questions, according to a recording I obtained of the meeting.

“Many of the questions in the document will be about the details,” Sutskever responded. “What, when, how, who, exactly. I wish I could go into the details. But I can’t.”

“Are we worried about the hostile takeover via coercive influence of the existing board members?” Sutskever read from another employee later.

“Hostile takeover?” Sutskever repeated, a new edge in his voice. “The OpenAI nonprofit board has acted entirely in accordance to its objective. It is not a hostile takeover. Not at all. I disagree with this question.”

Shortly thereafter, the remaining board, including Sutskever, confronted enraged leadership over a video call. Kwon, the chief strategy officer, and Anna Makanju, the vice president of global affairs, were leading the charge in rejecting the board’s characterization of Altman’s behavior as “not consistently candid,” according to sources present at the meeting. They demanded evidence to support the board’s decision, which the members felt they couldn’t provide without outing Murati, according to sources familiar with their thinking.

In rapid succession that day, Brockman quit in protest, followed by three other senior researchers. Through the evening, employees only got angrier, fueled by compounding problems: among them, a lack of clarity from the board about their reasons for firing Altman; a potential loss of a tender offer, which had given some the option to sell what could amount to millions of dollars’ worth of their equity; and a growing fear that the instability at the company could lead to its unraveling, which would squander so much promise and hard work.

Faced with the possibility of OpenAI falling apart, Sutskever’s resolve immediately started to crack. OpenAI was his baby, his life; its dissolution would destroy him. He began to plead with his fellow board members to reconsider their position on Altman.

Meanwhile, Murati’s interim position was being challenged. The conflagration within the company was also spreading to a growing circle of investors. Murati now was unwilling to explicitly throw her weight behind the board’s decision to fire Altman. Though her feedback had helped instigate it, she had not participated herself in the deliberations.

By Monday morning, the board had lost. Murati and Sutskever flipped sides. Altman would come back; there was no other way to save OpenAI.

I was already working on a book about OpenAI at the time, and in the weeks that followed the board crisis, friends, family, and media would ask me dozens of times: What did all this mean, if anything? To me, the drama highlighted one of the most urgent questions of our generation: How do we govern artificial intelligence? With AI on track to rewire a great many other crucial functions in society, that question is really asking: How do we ensure that we’ll make our future better, not worse?

The events of November 2023 illustrated in the clearest terms just how much a power struggle among a tiny handful of Silicon Valley elites is currently shaping the future of this technology. And the scorecard of this centralized approach to AI development is deeply troubling. OpenAI today has become everything that it said it would not be. It has turned into a nonprofit in name only, aggressively commercializing products such as ChatGPT and seeking historic valuations. It has grown ever more secretive, not only cutting off access to its own research but shifting norms across the industry to no longer share meaningful technical details about AI models. In the pursuit of an amorphous vision of progress, its aggressive push on the limits of scale has rewritten the rules for a new era of AI development. Now every tech giant is racing to out-scale one another, spending sums so astronomical that even they have scrambled to redistribute and consolidate their resources. What was once unprecedented has become the norm.

As a result, these AI companies have never been richer. In March, OpenAI raised $40 billion, the largest private tech-funding round on record, and hit a $300 billion valuation. Anthropic is valued at more than $60 billion. Near the end of last year, the six largest tech giants together had seen their market caps increase by more than $8 trillion after ChatGPT. At the same time, more and more doubts have risen about the true economic value of generative AI, including a growing body of studies that have shown that the technology is not translating into productivity gains for most workers, while it’s also eroding their critical thinking.

In a November Bloomberg article reviewing the generative-AI industry, the staff writers Parmy Olson and Carolyn Silverman summarized it succinctly. The data, they wrote, “raises an uncomfortable prospect: that this supposedly revolutionary technology might never deliver on its promise of broad economic transformation, but instead just concentrate more wealth at the top.”

Meanwhile, it’s not just a lack of productivity gains that many in the rest of the world are facing. The exploding human and material costs are settling onto wide swaths of society, especially the most vulnerable, people I met around the world, whether workers and rural residents in the global North or impoverished communities in the global South, all suffering new degrees of precarity. Workers in Kenya earned abysmal wages to filter out violence and hate speech from OpenAI’s technologies, including ChatGPT. Artists are being replaced by the very AI models that were built from their work without their consent or compensation. The journalism industry is atrophying as generative-AI technologies spawn heightened volumes of misinformation. Before our eyes, we’re seeing an ancient story repeat itself: Like empires of old, the new empires of AI are amassing extraordinary riches across space and time at great expense to everyone else.

To quell the rising concerns about generative AI’s present-day performance, Altman has trumpeted the future benefits of AGI ever louder. In a September 2024 blog post, he declared that the “Intelligence Age,” characterized by “massive prosperity,” would soon be upon us. At this point, AGI is largely rhetorical—a fantastical, all-purpose excuse for OpenAI to continue pushing for ever more wealth and power. Under the guise of a civilizing mission, the empire of AI is accelerating its global expansion and entrenching its power.

As for Sutskever and Murati, both parted ways with OpenAI after what employees now call “The Blip,” joining a long string of leaders who have left the organization after clashing with Altman. Like many of the others who failed to reshape OpenAI, the two did what has become the next-most-popular option: They each set up their own shops, to compete for the future of this technology.


This essay has been adapted from Karen Hao’s forthcoming book, Empire of AI.



Illustration by Akshita Chandra / The Atlantic. Sources: Nathan Howard / Bloomberg / Getty; Jack Guez / AFP / Getty; Jon Kopaloff / Getty; Manuel Augusto Moreno / Getty; Yuichiro Chino / Getty.
