CodeWithLLM-Updates
-

https://github.blog/2024-04-29-github-copilot-workspace/
GitHub introduces Copilot Workspace in Technical Preview - a new AI-based development environment

GitHub Copilot Workspace is a new development environment where programmers can work with code in natural language 💬. The system can support developers from the very beginning: planning the task ✅, writing specifications 📄, and generating code 💻.
All this can be edited and refined ✏️.

The environment is integrated with GitHub Issues 📥, Pull Requests 🔃, and repositories 🗃.

Developers can run and test code directly in Copilot Workspace 🚀. This simplifies and speeds up the entire software development cycle ⏱.

For professional developers, Copilot Workspace means more focus on systems thinking 🧠 and a step away from routine work that can be automated ⚙️.

https://youtu.be/eR855VNPjhk

On Groq's YouTube channel there is a demo of the project iter, running the mixtral-8x7b-32768 model in a terminal under nix-shell. I have not tested it - the author's approach is to print everything and keep minimal control over generation.

At 1:53 there is a cool feature, the reflection command: the model receives 6 requests with an instruction to think about the problem.
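Based on that description, a minimal sketch of such a reflection loop might look like the following. The `ask` function is a stand-in for whatever chat-completion call iter actually makes (e.g. against Groq's API); the prompt wording and the idea of feeding the previous answer back in are my assumptions from the video, not the project's actual code.

```python
# Sketch of a "reflection" loop: the model is queried several times in a
# row, each round seeing its previous answer and being told to think about
# the problem again. `ask` is a placeholder for a real LLM call.

from typing import Callable

def reflect(problem: str, ask: Callable[[str], str], rounds: int = 6) -> str:
    # First request: solve the problem directly.
    answer = ask(f"Solve this problem: {problem}")
    # Remaining rounds: re-read the problem plus the last answer and refine.
    for _ in range(rounds - 1):
        prompt = (
            f"Problem: {problem}\n"
            f"Your previous answer: {answer}\n"
            "Think about the problem again and improve your answer."
        )
        answer = ask(prompt)
    return answer
```

With a real client, `ask` would wrap an actual chat-completions call to the mixtral-8x7b-32768 model and return the text of the reply.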

The video is from March 8; Groq now also offers llama3-70b-8192 - but that is an 8k context window versus 32k.

In the VSCode extensions catalog I found Groqopilot v0.0.81, but at the moment it is broken more often than not.

https://www.youtube.com/watch?v=5poVsIeq3TM

Qualcomm presented Snapdragon X chips - not for phones, but for Windows computers. In addition to the usual CPU and graphics cores, they also include an NPU (neural processing unit). The promise is that applications will run faster, and large language models (LLMs) can be run locally and natively without problems.

At 17:53: VSCode with Qualcomm AI Code Completion, running llama-chat-v2-7b.

https://ollama.com/library/phi3
A new version of Microsoft's model, Phi-3 Mini, has been added to the ollama catalog.

Despite its relatively small size of 3.8B parameters, the developers say this model performs strongly on logic and mathematical tasks.

Phi models were trained with an approach similar to teaching children from "textbook-like" materials in mathematics, logic, and programming. This method is expected to improve the model's overall results in these areas.

Another advantage of the new model: there is a variant with a large context window of 128k tokens.

A new player has emerged that aims to compete with OpenAI and Anthropic - the company Reka AI.

Currently, they have three models:
🟧 Edge: a lightweight model with 7b parameters.
🟧 Flash: a fast and powerful model with 21b parameters.
🟧 Core: the largest and most powerful model for complex tasks (size unknown).

In non-multimodal benchmarks for generating code from text instructions, Core appears to fall short of GPT-4 and Claude 3 Opus.

https://www.youtube.com/watch?v=Smklr44N8QU

🆕 In the latest Cursor update, Copilot++ has been improved and the "Help" section has been turned into an AI chat (currently in beta).

This video review shows how Copilot++ can automatically generate code based on the context of the current project. The process of creating an interactive node graph in the Vue.js framework is demonstrated using hints and commands provided by Copilot++.

The main features of Copilot++ demonstrated in the video:

  1. Autocompletion of code on several lines simultaneously.
  2. Understanding the project context to provide relevant code suggestions.
  3. Ability to add documentation as reference materials for AI.
  4. Flexible control of code generation through text instructions.

The video emphasizes that using AI assistants like Copilot++ lets developers focus on high-level logic instead of writing individual lines of code. This can significantly increase productivity and help developers stay relevant amid rapid technological change.

I wrote it for you :)

AI in Software Development: Current State and Prospects

🥳 Llama 3 is out!

The models were trained on two newly announced, specially built 24K-GPU clusters, on over 15T tokens of data - a training dataset 7 times larger than the one used for Llama 2, including 4 times more code.

https://llama.meta.com/llama3/

The models are also deployed on Groq's LPU hardware: https://groq.com/

The official Tabby registry has now added models from the CodeGemma series (2b and 7b) and CodeQwen (7b), for both completion and chat.

CodeGemma, Google's set of open models for writing code (7B and 2B), has been added to ollama (which Cody supports) and to catalog.ngc.nvidia. I did not find any full-fledged reviews or comparisons - perhaps no one is interested.

I also could not find a direct statement from Google about these models' context window size.

Claude 3 is now available to all Cody users 🚀
( blog )

Cody now supports the new Claude 3 family of models from Anthropic, which includes three models: Haiku (the fastest), Opus (the smartest), and Sonnet (intermediate).

These models demonstrate improvements in code generation, the ability to quickly recall information from a large context, and other characteristics important for Cody.

🆓 For Cody Free users, the default model is now Sonnet (4th in the LMSYS Chatbot Arena rating), replacing Claude 2.0 (15th in the same rating).

Cody Pro users can choose between Haiku (8th in the LMSYS Chatbot Arena rating), Sonnet, and Opus (1st in the rating).

A video from VRSEN presents Devid, an AI software engineer. In it, the author demonstrates his open implementation of Devin, which has three main advantages: full access to the source code, training on real coding tasks rather than just GitHub Issues, and an agent-based architecture.

The author shows how Devid creates a website with the game "Game of Life", modifying HTML, CSS, and JavaScript files. Then he demonstrates how to import Devid and other agents into his own project using Docker containers. The author also describes how to set up planner agents, developer agents, and browser agents so that they effectively collaborate to perform tasks.

Finally, the author tests this agent system on a task of benchmarking several APIs, showing how agents can find documentation, execute code, and provide results.

So far everything works rather mediocrely, although he blames OpenAI for this. If the documentation has many pages, errors occur. It is also not mentioned how many tokens these tasks consumed - it is simply stated that this is more efficient than Devin.

https://youtu.be/BEpDRj9H3zE

At the Cloud Next conference, Google showed its programming assistant (on the slide it was a VSCode plugin). The main emphasis was that Gemini 1.5 has a context window no competitor can match - 1M tokens.

Gemini Code Assist is available for testing for free until July 11, 2024.

Video "Why I stopped using Copilot":

🧠 If you don't practice skills, you can lose them. Using Copilot influenced the way I wrote code, prompting me to wait for AI hints instead of using my own brain.

👨‍💻 Writing code became less interesting. Copilot deprived me of the opportunity to learn, be creative, and solve problems on my own, which I enjoyed.

🔍 The quality of Copilot's hints was unstable - often they were outdated or contained errors. I had to check the documentation, which reduced efficiency.

🔒 Privacy is a big problem. Every time I used Copilot, fragments of my code were sent to a remote server, which is unacceptable to me as a supporter of privacy and self-hosting.

https://www.youtube.com/watch?v=Wap2tkgaT1Q

( a little clickbait and only one point of view )

Phind now gives 5 free 70b requests per day (10 for registered users). Previously, this quota applied to GPT-4.

There is also a tab called "ask a question about your code", marked as experimental. There is no way to test it on the $10 plan - you need to upgrade to $20.

I suspect this is supposed to connect a repository, but the paid plans page gives no details yet.

I found a startup, Coze — a significantly expanded clone of the GPTs functionality, where

  1. the chat is not tied only to the OpenAI website - the bot can be accessed from Discord, Telegram, Slack (business bots), Facebook and Instagram Messenger, LINE (popular in Asia), Reddit (bots for specialized communities), Cici, and Lark.

  2. you can choose GPT-3.5, with a limit of 500 messages/day

  3. a catalog of preconfigured plugins, including GitHub, StackOverflow, Code Interpreter, and Data Analysis

  4. multi-agent mode - explanation in the video
    https://www.youtube.com/watch?v=l00ZB2ZaVO0

⌨️ In the bot store I found Code Companion from icheQ - now 11.7K users. It can also be accessed from Telegram.

Minus: this is a startup currently without a financial model. Access is US-only; the Telegram bot answered me after 3 minutes, and overall everything works very slowly.

Limits: GPT-4 100-50 messages/day.

chat chaos https://t.me/+m7bX9D4WjV4yMzgx

Google released a video from their Gemma Developer Day 2024.

Gemma is a family of open LLMs that can be run locally.

https://youtu.be/zzw2OSFw9xI

Bito is the product most similar to GitHub Copilot, but with a limited free plan (100 completions per month, 20 chat messages per day).

🤖 The video is about the new Bito AI Code Review Agent, which promises to cut code review time by 50% and improve code quality (available only on the $15/month plan, but there is a trial).

🔧 The agent integrates with GitHub and GitLab, automatically performs static code analysis, checks for vulnerabilities, and provides detailed comments with recommendations for code improvement.

The dbrx-instruct model (github) has appeared in the chats section of Chatbot Arena and in the Perplexity playground. I ran a series of code generation tests, and the results are indeed solid. It is also faster than CodeLLaMA-70B.

The developer of the VSCode plugin Double has added DBRX Instruct alongside GPT-4 Turbo and Claude 3 (Opus), although it is not very clear why, and has also opened a GPT-5 waitlist.

DataBricks, a company known for its data processing and analysis solutions, has released one of the most powerful and efficient open LLMs - DBRX. On the graphs published in the announcement post, the DBRX model outperforms other open solutions in mathematics and programming.

This is a 16x12B mixture-of-experts (MoE) model - 132 billion total parameters, of which 36 billion are active for processing each token - that in many tasks outperforms the open Grok-1 and the closed GPT-3.5 Turbo (but not Claude 3 Haiku). The context window is 32k, with a GPT-4-style tokenizer. Knowledge cutoff: December 2023.
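The "active parameters" figure follows directly from the routing: only some experts run per token. A rough back-of-the-envelope sketch, using DBRX's published 132B/36B numbers and its 4-of-16 expert routing - note that the shared/expert split computed below is my own derivation from those totals, not an official breakdown:

```python
# Back-of-the-envelope split of an MoE model's parameters into shared
# weights (attention, embeddings) and expert weights, from two facts:
#   shared + experts               = total parameters
#   shared + (k/n) * experts      = active parameters per token
# Numbers are DBRX's published figures; the derived split is illustrative.

TOTAL_B = 132   # total parameters, billions
ACTIVE_B = 36   # parameters active per token, billions
N_EXPERTS = 16  # experts per MoE layer
K_ACTIVE = 4    # experts routed to per token

# Solve the two linear equations above for the expert/shared split.
expert_b = (TOTAL_B - ACTIVE_B) / (1 - K_ACTIVE / N_EXPERTS)  # all experts combined
shared_b = TOTAL_B - expert_b                                 # everything non-expert

print(f"expert parameters (all {N_EXPERTS} experts): {expert_b:.0f}B")
print(f"shared parameters: {shared_b:.0f}B")
print(f"active per token: {shared_b + expert_b * K_ACTIVE / N_EXPERTS:.0f}B")
```

The same arithmetic explains why inference is cheaper than the 132B headline suggests: each token only pays for the shared weights plus a quarter of the expert weights.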

They say that according to tests it outperforms CodeLLaMA-70B. The DBRX model is large enough that not everyone can run it, but not as huge as Grok-1, which practically no one can deploy at home right now. Meta plans to release Llama 3 sometime in July.

The chat is also available at https://huggingface.co/spaces/databricks/dbrx-instruct
(5-shot max)
