<Technical post; probably belongs on LinkedIn. Posting here regardless>
It's a new world of coding with AI. I have been pair-programming with ChatGPT (and to a lesser extent Google Gemini) for a while now. My emotions run the full gamut and it is quite a roller-coaster :-). One minute I am full of awe, and the next I am thinking "I can't believe you're that stupid". I will describe my experience with a couple of projects, one big and one small. If you want to skip the project descriptions, jump to the Morals of the Story section at the bottom.
Mobile App with Backend
I am starting on a project this year, the first one where we are coding from scratch since the recent AI revolution. I wanted to provide the team that I am leading with some starter code for the backend (Python) and the mobile app frontend (Flutter/Dart).
- Backend data model: I had written a document describing the entities, their fields, and their relationships with other entities. I fed this document to ChatGPT and asked it to generate the model class definitions for a model-first Object-Relational Mapping approach using the Flask/SQLAlchemy ORM. This saved a lot of time writing code for the classes. ChatGPT did 90% of the work and I could make adjustments to get it to 100%.
- Backend data generation APIs: I gave detailed prompts asking ChatGPT to write code for a REST API that walks the object model, packages all the nested objects in the hierarchy, and returns the data as JSON. ChatGPT's code worked quite well here, and again the 90%-10% split above worked out.
- Client app code: I had mixed results here with the LLMs:
- I asked ChatGPT to write the Flutter/Dart code for the app to consume the JSON sent by the server API. I needed to persist the objects on the client in Hive local storage for offline availability. ChatGPT's first attempt only persisted the top-level objects and not the objects embedded within them. When I pointed this out, ChatGPT added code to persist the embedded objects as well, but that code had one mistake, which I was able to debug and fix easily. I still ran into issues with Hive that I could not fix with additional prompts, and then I hit my session limit; ChatGPT asked me to come back after a few hours to continue.
- I then moved to Gemini, which did slightly better than ChatGPT in some respects but worse in others. It made some silly mistakes, like a missing dependency in the project dependencies YAML file, which wasted some of my time. Gemini's advice for fixing the issue was not useful; I ended up figuring out the fix myself and giving the advice back to Gemini, which it graciously seemed to accept, promising not to make the mistake again. After these were fixed, somehow the Hive issue that had stumped ChatGPT did not recur, and I was able to finish the work. The generated code will essentially serve as the proof of concept of how our design would work: the backend data model, APIs for generating data in JSON, and client code to consume the data and persist it in the local cache.
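For readers unfamiliar with the model-first approach mentioned above, here is a rough sketch of what such class definitions look like, using plain SQLAlchemy (Flask-SQLAlchemy wraps the same machinery). The entity names (Course/Lesson) are invented for illustration; they are not our actual data model.

```python
# Sketch of a model-first ORM layout: tables are generated from
# the class definitions, not the other way around.
# Course/Lesson are invented example entities.
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

class Course(Base):
    __tablename__ = "course"
    id = Column(Integer, primary_key=True)
    title = Column(String(120), nullable=False)
    # One-to-many: a course owns its lessons
    lessons = relationship("Lesson", back_populates="course")

class Lesson(Base):
    __tablename__ = "lesson"
    id = Column(Integer, primary_key=True)
    name = Column(String(120), nullable=False)
    course_id = Column(Integer, ForeignKey("course.id"), nullable=False)
    course = relationship("Course", back_populates="lessons")

# Model-first: create the schema from the classes
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

session = Session(engine)
session.add(Course(title="Flask Basics",
                   lessons=[Lesson(name="Routing"), Lesson(name="Templates")]))
session.commit()
```

This is the kind of boilerplate that a good entity-relationship document lets the LLM generate almost mechanically.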
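The "walk the object hierarchy and package everything as JSON" part of the data generation API can be sketched independently of the ORM; here is a minimal stdlib-only illustration with dataclasses (again, the entity names are invented):

```python
# Minimal sketch of packaging a nested object hierarchy into one
# JSON payload. Course/Lesson are invented example entities.
import json
from dataclasses import asdict, dataclass, field

@dataclass
class Lesson:
    id: int
    name: str

@dataclass
class Course:
    id: int
    title: str
    lessons: list = field(default_factory=list)

def course_to_json(course: Course) -> str:
    # asdict() recurses into nested dataclasses, so the whole
    # hierarchy (course plus embedded lessons) lands in one payload
    return json.dumps(asdict(course))

course = Course(1, "Flask Basics",
                [Lesson(1, "Routing"), Lesson(2, "Templates")])
payload = course_to_json(course)
```

A real endpoint would do the same walk over ORM objects and return the payload from a Flask route; the point is that the client receives the embedded objects in the same response, which is also why the client-side persistence (next section) has to handle nesting.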
Text-to-Speech
I was trying to integrate Google's text-to-speech converter to get it to speak Indian/Sanskrit words, and I had to do some coding on top to get it to work. I needed to provide custom pronunciations for the Indian words that the text-to-speech engine would mispronounce by default. Some gnarly handling was required when words with custom pronunciations appear in possessive form with an apostrophe (say, "Durga's"), and Google TTS wasn't up to the task without additional coding. I enlisted ChatGPT's help. V1 of the code didn't quite work. I did some debugging on my end, realized what might be happening, and gave ChatGPT a hint; it duly used the hint and tried a fix in V2. Still didn't work. I then just said: nope, try again. Now it seemed to figure out by itself what the problem might be and produced V3. That didn't work either. I gave up on this particular thread after V5 or so.
While I could not get that last thing (handling apostrophes gracefully) working, thanks to AI I was still able to try out three different text-to-speech solutions (AWS, Google, ElevenLabs) and came away with a potentially reasonable V1 approach.
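To make the apostrophe problem concrete, here is a minimal sketch of a text pre-processing step that substitutes custom pronunciations while keeping the possessive suffix attached. Everything here is invented for illustration: the pronunciation map, the spelled-out approximation, and the assumption that the real pipeline would then hand the rewritten text (or SSML) to the TTS API.

```python
# Sketch: substitute custom pronunciations while preserving an
# attached possessive ('s), so "Durga's" becomes "Door-gaa's".
# The pronunciation map and spellings are hypothetical examples.
import re

PRONUNCIATIONS = {
    "Durga": "Door-gaa",
}

def apply_pronunciations(text: str) -> str:
    """Replace known words with custom spellings, keeping any
    possessive 's (straight or curly apostrophe) on the result."""
    for word, spoken in PRONUNCIATIONS.items():
        # \b avoids matching inside longer words;
        # the optional group captures a trailing possessive
        pattern = re.compile(rf"\b{re.escape(word)}('s|\u2019s)?\b")
        text = pattern.sub(lambda m: spoken + (m.group(1) or ""), text)
    return text

print(apply_pronunciations("Durga's blessing"))  # Door-gaa's blessing
```

This naive version breaks down quickly (plural possessives, case variants, pronunciations that themselves contain apostrophes), which is roughly where the ChatGPT iterations kept failing for me.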
Morals of the story
- At the time of writing, LLMs like ChatGPT are good enough to generate a lot of the boilerplate code that needs to be written. This can save a lot of time.
- Often the 80-20 rule applies: you get 80% of the lines of code from the LLM and you finish the last 20%; it is 90-10 if you are lucky. And if you are really lucky, the code will be bug-free. That gets us to the next point.
- A big caveat is that the generated code is very likely to have bugs, and sometimes serious gaffes, so you need to understand the logic completely in order to fix the bugs and get it to work. At times you can keep iterating (telling the AI that it is not working and to try again), but you are better off finding the bugs yourself, or at least doing some partial troubleshooting, so that you can frame your incremental prompts with helpful information for the AI. If you run into a situation where you cannot figure out the fix, you may start losing the time-saving advantage of getting the boilerplate code; but then again, if you had written the code from scratch, your code might have the same (or a higher) number of bugs.
- It is super important to invest in good English documentation of your high-level design (e.g., the data model), so that you can give detailed requirements in English in your prompts. That documentation can serve your entire project, generating both server and client code. This might finally be the forcing function that gets developers to write good documents. It is a bit worrisome because, in my experience, developers (especially in India) often hate documenting their designs, requirements, and algorithms, or are not good at it (or both). But this is going to become more important in the new world.
R. Balaji