Project Spotlight: Optimising Product Updates/Creates in Pipelines

Challenge - Our platform, which houses millions of products, is subject to regular updates by Marketplace sellers via APIs and by Kogan.com's internal teams. However, an average update pipeline duration of 8 hours meant these changes were slow to become visible to customers on the website and app.

Determined to tackle this challenge, we focused on speeding up the update process while also reducing infrastructure costs, so that our customers could access the most recent product information sooner.

An analysis of the update pipeline revealed several areas requiring our immediate attention:

  1. Repetitive Update Events - recurrent API calls from Marketplace sellers produced duplicate update events for the same product. This redundancy wasted significant processing capacity.
  2. Sluggish Database Queries - inefficient database queries prolonged processing times, delaying the publication of updated product information.
  3. Architectural Bottlenecks - restrictive architectural elements limited system throughput and curtailed overall performance.

To overcome these hurdles, the following measures were taken:

  1. Enhanced Diffing Logic - by leveraging an OpenSearch cluster, we overhauled our diffing logic. The cluster lets us compare incoming API payloads against the current product state in real time; if nothing in the product information has changed, the redundant request is skipped (see the sketch after this list).
  2. Optimized Database Queries - we thoroughly optimized our database queries, adding prefetching and removing unnecessary joins and N+1 queries (a second sketch after this list shows the classic N+1 fix). This overhaul addressed the root causes of the sluggishness and sped up processing times.
  3. Introduction of a Datastore - we introduced a new datastore designed to house denormalized product data. It now backs the product information displayed to customers, reducing our dependence on OpenSearch and allowing it to be gradually scaled down. This refined infrastructure bolstered system responsiveness and agility.
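To illustrate the diffing idea, here is a minimal sketch in Python, assuming we keep a fingerprint of the last-seen payload alongside each product record (the hashing scheme and the store object are illustrative, not our production code):

import hashlib
import json

def payload_fingerprint(payload: dict) -> str:
    """Stable hash of a product payload, independent of key order."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def should_process(product_id: str, payload: dict, store) -> bool:
    """Return False when the incoming payload matches the stored state."""
    new_fingerprint = payload_fingerprint(payload)
    if store.get(product_id) == new_fingerprint:
        return False  # duplicate event: nothing changed, skip it
    store.set(product_id, new_fingerprint)  # remember the new state
    return True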
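And for the N+1 problem, the classic Django ORM fix looks roughly like this (Product, seller and images are hypothetical models and relations, shown only to illustrate the technique):

# N+1: one query for the products, then one extra query per product.
for product in Product.objects.all():
    print(product.seller.name)

# Fixed: related rows are fetched up front in a constant number of queries.
products = Product.objects.select_related("seller").prefetch_related("images")
for product in products:
    print(product.seller.name, len(product.images.all()))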

By implementing these measures, we addressed the inefficiencies in our product update pipeline in a timely manner. Our work played a big part in providing our customers a more seamless experience on our site! It also enables us to scale our operations so we can support a wider range of products and continue meeting the needs of our customers in the future.

5 Simple Tips For Optimising Query Performance in BigQuery

Optimising query performance in BigQuery presents large cost-saving opportunities for the Engineering Team, as well as a speed boost which benefits the business, stakeholders and developers alike.

Following on from our meeting with the Google team, I was inspired to do some further learning and challenge myself to write a short blog post with my top 5 simple tips for writing better queries.

This is by no means an exhaustive list, or even the best tips. My hope is that this blog post will inspire you (and me) to do further research and practice.

Rather than provide example code, which is useful but boring, I've mostly gone with pictures instead, which are less useful but also way less boring.

I will also provide a couple of links at the end of the post for those who want to learn more.

Tip 1. Avoid SELECT *

OK, I admit it: I have been guilty of using SELECT * on massive tables, believing that applying the LIMIT clause would reduce the bytes processed. I was wrong.

Best Practice

  • Want to explore the data? Use the data preview function instead of SELECT *.
  • Query only the columns you need (see the dry-run sketch at the end of this tip).

    • e.g. SELECT dim1, metric1 FROM … or
    • SELECT * EXCEPT (dim2, metric2) FROM …
  • If you do require every column, consider:

    • Materialising results in a destination table and querying that table instead.
    • Partitioning your tables and querying the relevant partition.
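For those who do want something concrete, a dry run reports how many bytes a query would scan without actually running it. Here is a minimal sketch using the google-cloud-bigquery Python client (mydataset.sales and its columns are made up):

from google.cloud import bigquery

client = bigquery.Client()

def dry_run_bytes(sql: str) -> int:
    # Ask BigQuery to only plan the query and report the bytes it would scan.
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    return client.query(sql, job_config=config).total_bytes_processed

print(dry_run_bytes("SELECT * FROM `mydataset.sales`"))
print(dry_run_bytes("SELECT dim1, metric1 FROM `mydataset.sales`"))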

Tip 2. Table JOIN Order Matters

The aim here is to reduce the dataset as early as possible. I thought that BQ optimized joins automatically, but apparently this is not always the case.

Best Practice

  • Manually place the largest table first, followed by the smallest, then the remaining tables in decreasing order of size.
    • You want to reduce your resultset the most at the first join.

Tip 3. Know when to aggregate early and late.

Aggregation is very costly so care needs to be taken.

Best Practice

  • As a general rule, aggregate and perform complex operations as late and as seldom as possible.
  • UNLESS a CTE (Common Table Expression) can be reduced drastically by aggregating before joining - then aggregate early! You want to prune data as much as possible before a join.

Tip 4. When using AND in WHERE clauses, put the most selective condition first.

Again, the point is to reduce the data as early as possible.

Best Practice

  • Order the expressions in your WHERE clause with the most selective expression first (i.e. the one which will eliminate the most rows).
  • Then use expensive expressions like LIKE ‘%text%’ only after you have reduced your dataset.

Tip 5. Use Approximate Functions where possible.

I had no idea these even existed. If you don’t need precision >99%, approximate functions yield faster query performance.

Best Practice

  • Where appropriate, use approximate functions (e.g. in data exploration).
    • E.g. use APPROX_COUNT_DISTINCT(dim1) instead of COUNT(DISTINCT dim1)


Hack Day - Chat (Khat) GPT

Why did the AI buying assistant start a stand-up comedy career? Because it knows how to find the best "deals" of laughter!

Here is hoping ChatGPT is better at being a buying assistant than it is at writing jokes!

ChatGPT is the hottest topic in the industry and, inspired by a recent tech talk by our CTO, we used our most recent hack day to try to answer some questions about what Kogan.com could use ChatGPT for. Could we train ChatGPT with knowledge of our products so that it can be a buying assistant for our customers? Could we train it with our FAQs and customer support information so that it can help answer customer queries more efficiently? Through a few different experiments, we tried to answer these questions.

So what is ChatGPT?

ChatGPT is an advanced artificial intelligence language model developed by OpenAI that specialises in understanding and generating human-like text. It works by analysing large amounts of text data and predicting the most likely responses or content based on the context provided, leading to highly accurate and coherent interactions.

Where to begin?

The ChatGPT team on our hack day was the largest, and involved a good mixture of different types of developers. We split into two main groups: our backend developers focused on training the model with our data and creating an API that could be interacted with, while our mobile developers focused on adding a chat feature to our current app, which would interact with the created API.

Implementation

Training the model

Starting out, we investigated which language model would be the best fit for our application. Well-known models like GPT-3.5/4 are best suited to precise answers, whereas models like Davinci are better at extended responses. As this was a support-type bot with a general audience, we went with Davinci for a better user experience.

The process of training the bot involved creating a GPT vector index containing all our data, using the OpenAI language model we’d selected. We used the Python library LlamaIndex to facilitate this process. We were then able to query our GPT vector index with our questions.
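In outline, the index-building step looked something like the sketch below. This follows the early-2023 llama_index interface (the library's API has changed considerably across versions), and the folder path and question are made up:

from llama_index import GPTSimpleVectorIndex, SimpleDirectoryReader

# Load product descriptions and manuals from a local folder.
documents = SimpleDirectoryReader("data/product_docs").load_data()

# Build the vector index (this calls the OpenAI API to embed the documents).
index = GPTSimpleVectorIndex.from_documents(documents)

# Persist the index for serving later, then try a question against it.
index.save_to_disk("product_index.json")
print(index.query("Can I use the Estelle Hair dryer in the bathroom?"))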

We trained one of the models using product descriptions and product manuals, and the second using customer support FAQs and a collection of fake customer support tickets, to see if we could train positive interactions into the model.

So, with our models in hand, we jumped in with some questions!!

Immediately, we found that general information from our customers’ interactions with the support team, such as customer names, was being used by the model to assume details like the user’s name. Questions like,

“How can I get a refund on my order?”

would often come back with responses like,

“Sure, to receive a refund from Kogan…”

The security and privacy implications of a model trained on real conversations were becoming obvious, and seemed like a bigger problem for another day, so we had to park the customer support branch of this project.

We focused on the product assistant for the rest of the day, which turned out to be a good decision, as we got some great results.

API Setup

In addition to building the model, we also needed to host it somewhere to make it available for the mobile app to consume. Initially, we started building a lambda function behind an API gateway in AWS to host the application, but in the final hour of our hack day we realised that the model wouldn’t fit into the lambda function’s available space. We quickly pivoted, spinning up an EC2 instance using AWS Lightsail and setting up Flask as our API. It was great to see the team mitigate issues like this as they popped up.
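The serving layer needed nothing fancy. Here is a minimal sketch of the kind of Flask wrapper involved, again using the early-2023 llama_index interface (the route name and index file are assumptions, not the exact hack day code):

from flask import Flask, jsonify, request
from llama_index import GPTSimpleVectorIndex

app = Flask(__name__)

# Load the index built during training (the path is illustrative).
index = GPTSimpleVectorIndex.load_from_disk("product_index.json")

@app.route("/ask", methods=["POST"])
def ask():
    question = request.json.get("question", "")
    response = index.query(question)
    return jsonify({"answer": str(response)})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)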

Solution

At the end of the day, the team was able to showcase a working prototype, with our mobile app integrating with the API and returning some interesting responses. The most promising response we got was to a question about safety information contained within one of our product guides.

Q: “Can I use the Estelle Hair dryer in the bathroom?”

A: “No, you should not use the Estelle Hair Dryer in the bathroom. According to the safety warnings, it should not be used in the bath or shower, near a bathtub, washbasin or other products that contain water, and should be unplugged after use and the speed and heat settings should be set as required before use if it is being used in the bathroom.”

Another example of a general question was:

Q: “What options are available in the Kogan washing machine?”

A: “The Kogan 7kg Top Load Washer offers a range of options, including 7kg clothes capacity, 6 wash programs and memory function, 8 water levels and 9 types of water power, LED display and delay start function, fuzzy logic control and detergent dissolve tech, child safety lock feature, and a drain pump filter to avoid blockages. It is made from high-quality, clear ABS material, making it easy to maintain and durable for everyday use. It also features personal preference settings, allowing you to customise your washing experience.”

At times, the model wouldn’t align the correct information with a given product, but this was always good for a laugh.

Q: “Which Kogan toaster would best suit my family of 6?”

A: “The Kogan 4 Slice Toaster (Silver) would best suit a family of 6. It features two independent browning controls, seven adjustable browning settings, built-in Defrost and Reheat functions, an ultra-sleek silver design, a weight capacity of 50kgs, and a frame material of wood. It also has a removable crumb tray at the bottom for easy cleaning.”

To improve this we could have provided our data in a more structured format, but for this experiment all our data was unformatted.

One of the main problems we discovered was the time it took to get a response from the API. Responses often took 5-10 seconds, and arrived as complete blocks rather than word by word like the ChatGPT client in your browser. This was just a limitation of the library’s implementation and settings, so it could be mitigated with a more bespoke solution.

As an added bonus, our mobile developers also took the time to work with SwiftUI, with one even using ChatGPT to assist along the way, and we were all impressed with its responses.

Final Thoughts

Overall, it was definitely a successful hack day for the ChatGPT team, and it highlighted the immense potential of ChatGPT use cases at Kogan.com. Hack days are a great opportunity to collaborate with other squads within the engineering team which we wouldn’t usually interact with on a daily basis.

Kogan.com Engineering Growth Paths

Committed to learning and continuous improvement, Kogan.com’s Engineering team develops its engineering talent through giving and taking responsibility, co-creation, mentorship, and the opportunity to take internal secondments to grow skill sets. As our Engineering teams continue to expand, we’ve created opportunities for new leaders and provided stretch roles that align with our engineers’ individual goals.

There are opportunities for Engineers at Kogan.com regardless of background. Some engineers at Kogan.com are Individual Contributors, Tech Leads or People Managers – and the growth path is up to them. Read on to get to know Akshay, Ganesh and Adam who all share their engineering journeys and advice for other Engineers who are trying to expand their careers.

Tech Lead & People Leader - Akshay Wadhanave

Q: You grew from an Individual Contributor (IC) to a Tech Lead through a secondment opportunity that became available. What led you to take up the opportunity, and how were you supported through this journey?

The opening interested me as I had really enjoyed the team and technical challenges I was working on and wanted to see how else I could grow my career within the company.

I also had previous experience as a Tech Lead, so I wanted to apply it here.

Through this secondment, I was given exposure to different parts of our e-commerce platform and the technical challenges faced by the new team I was joining.

As part of my transition to the new role, I received regular coaching from our Head of Engineering and CTO. It was great to be exposed to senior leaders within the team and to learn from their experience; I felt they were really invested in my progress. To gain more targeted leadership coaching, the company also enrolled me in a Tech Lead course where I learnt from an expert in our industry. I also had the opportunity to learn alongside Tech Leads from other companies and share our different perspectives.

Q: What do you do in your day-to-day at work?

My top priority is keeping engineers on my team unblocked to do their best work. This involves working with our stakeholders and product teams to ensure we’re continuously working on the right things and that we're delivering on time.

We’re always working on technical improvements and I like driving these conversations with my team and listening to their ideas.

Despite being in a leadership position, I still enjoy the hands-on parts of my role. I believe this keeps my technical skills sharp.

Q: What advice can you share with anyone considering a similar journey?

My advice to anyone considering a transition to a role like Tech Lead is to put your hand up and be brave enough to take on the challenge.

If you know the environment you’re working in nurtures and supports you (as mine has at Kogan.com), don’t doubt yourself and just do it!

Tech Lead & People Leader - Ganesh Kawade

Q: You joined Kogan.com as a Tech Lead. Your most recent role before joining us was as an Individual Contributor (IC). What helped you gain trust and influence within your team, and how were you supported through this journey?

When I joined Kogan.com as a Tech Lead, I was completely new to the domain and tech stack. The team here is already so diverse in tech stacks and skills that I felt I fit right in. I was supported by my awesome team from day one.

What helped me gain trust and influence in the team is mostly open and transparent communication. The team that I lead consists of individuals with a very high technical skill set and all have a passion for doing things the right way.

I mostly gained influence by helping team members work to their full potential with minimal distractions. I guide and advise them, but equally take a step back so team members can come up with their own solutions.

Q: What do you do in your day-to-day at work?

I help my team build cool things. My day-to-day is about helping my team deliver amazing features across our product suites. I also work with them to solve challenging technical problems at scale.

Q: What advice can you share with anyone considering a similar journey?

Just go for it. You will learn on the way, so don’t hold back.

Senior Software Engineer - Adam Slomoi

Q: You joined Kogan.com as a Software Engineer and within a short period progressed to Senior Software Engineer - Individual Contributor (IC). What did you do to achieve this, and how were you supported through this journey?

I try to ask questions and learn from those who have more experience than I do, and similarly help others where I can. We have lots of things going on in the engineering team and I’ve had the opportunity to lead some initiatives.

Q: What do you do in your day-to-day at work?

We’ll generally discuss the cards we’re planning on doing for the day. I’ll either start coding myself or join up with someone in the team for some pair programming. I’ll then review someone else’s card, mine will get reviewed, and then we’ll close out anything raised in QA. Woven throughout this are lots of architectural discussions, system monitoring and upskilling.

Q: What advice can you share with anyone considering a similar pathway to yours?

Enjoy it and keep learning throughout. The technology landscape is massive and constantly evolving so it’s important to keep learning new things and stay up-to-date.

Come grow with the Engineering Team at Kogan.com—we’re hiring!

Interested in taking the leap into leadership or finding ways to expand your technical impact? We’re always looking for experienced engineers to join our growing team!

Improving our Android build performance 📈

Here at the Kogan.com Engineering team, we care about developer experience: every time an engineer makes a code change, they have to go through the Edit-Compile-Run cycle.

Every day we go through this cycle hundreds of times, so any degradation of the build time would have a big impact on our team’s overall productivity.

To improve the build time, in this blog post we’ll look closely at the Android team’s build process and identify the areas we can improve. Let’s go.

Analyzing a Gradle build

Before we continue, we need to clarify where we want to start optimizing our build. Builds come in two flavours: clean builds and incremental builds. Incremental builds vary from case to case, so we will start with the clean build scenario; hopefully any optimization there also contributes to reducing the build time of incremental builds.

Test machine

Dell XPS 15 9550

CPU: i7-6820HQ

Memory: 32GB

JDK: OpenJDK 11.0.16.1

Build time measurement methodology

To measure the build time more accurately, we have to minimize the impact of other applications, like Android Studio, on performance. To do that, we ran this command from the command line and left only the command line window open:

./gradlew clean assembleAuDebug --scan

Whenever we start a new experiment, we make sure no JVM is alive (to avoid the last experiment polluting the current one). To do this, we ran this command before each experiment:

./gradlew --stop

During each experiment, we ran the clean and build command 6 times consecutively, without stopping the Gradle daemon, to get a more accurate measurement that accounts for JVM warm-up.

Initial result

We can see that the build time fluctuates a bit, but one thing we can say for sure: it takes over 240 seconds (~4 minutes) for a clean build.

First analysis

Let’s start by taking a look at the build scan Gradle provides:

This table shows the time each Gradle task takes to finish. We can see clearly that most of the time was spent compiling Kotlin code and dexing, which are very standard Android processes.

While the standard processes are hard to optimize right away, we can look at something that isn’t standard. If you look carefully, you can see there is an outlier here: kogan:transformClassesWithNewrelicTransformForAuDebug

It takes about 23s to finish. Ouch. Let’s see if we can optimize it.


Experiment One: Newrelic

Newrelic is a product that monitors application performance. We don’t think it’s useful to have it turned on for debug builds. After looking through the official documentation, we found a way to disable it for debug builds:

app/build.gradle

newrelic {
    // do not instrument these build variants
    variantExclusionList = ["debug"]
}

Result:

Conclusion

Simply disabling Newrelic for debug builds yields a huge improvement. Interestingly, it also stabilizes the build time, barring the 5th trial. It’s such a simple change that improves the build time by at least 25%! We could have stopped here, but it seems there is more low-hanging fruit to go for.

We’ll therefore enable this optimization for future experiments.

Experiment Two: Latest Kotlin

Kotlin is an evolving language, and every new version brings performance improvements. Let’s see if we can improve any further by simply bumping the Kotlin version: we went from 1.5.10 to 1.7.0.

Result

Conclusion

Even though updating Kotlin gave us only a slight advantage, we’ll still keep the Kotlin update in our next experiments because we love to keep the latest tools available to us!

Experiment Three: Latest JDK

Contrary to common belief, Java has a rapidly evolving ecosystem. As of now, the latest long-term-support JDK is 17. Let’s see if Java fulfils its backward-compatibility promise and lets us build the project without any changes to the code.

Conclusion

The result is mostly promising: it gave us an edge on the 3rd, 4th and 6th trials, and on the 6th trial it almost halved the time. Not bad at all.

We’ll keep using JDK 17 for future experiments!

Experiment Four: KSP

Let’s look again at the latest build analysis, taken from one build in the last experiment.


It’s clear that we have to look closer at Kotlin related tasks now.

You can see the familiar compilation and dex phases across the tasks. If your eyes are sharp, you can also spot the not-so-familiar kapt-related tasks.

Kapt: annotation processing in Kotlin land

What is kapt? Well, in our opinion, out of all the modules in the Kotlin ecosystem, kapt must be the hackiest one: it was invented to run Java annotation processors on Kotlin code. If you are not familiar with annotation processing, it’s code that runs at compile time, usually to generate new code into the compilation process. It’s widely used by libraries like Dagger, Room, etc.

In Java land, annotation processors run inside the javac process. However, Kotlin has its own compiler: kotlinc. The Kotlin compiler could have implemented annotation processing (JSR 269) from scratch, but in the end the JetBrains people chose not to reinvent the wheel. Instead, the Kotlin compiler generates dumb Java classes (stubs) from the Kotlin files, then uses javac to run annotation processors on these stubs. JetBrains has published an article explaining how kapt works: https://blog.jetbrains.com/kotlin/2015/06/better-annotation-processing-supporting-stubs-in-kapt/.

This process seems pretty wasteful to us: even without reading the details, you can guess how much extra work the compiler has to do, given the size of the codebase.

Is there any alternative to kapt? Indeed there is…

Enter KSP: the modern symbol processor for Kotlin

It turns out Google has developed a new tool called KSP that hooks directly into the Kotlin compilation process. This native Kotlin solution allows code generators to run more efficiently, without jumping through the hoops of the Java annotation processors. Details can be found here: https://kotlinlang.org/docs/ksp-why-ksp.html.

While it’s a sound solution, it comes with a major drawback: it’s not compatible with existing Java annotation processing libraries. The reason is obvious: it’s a native Kotlin plugin that only handles Kotlin code. This means every Java annotation processing library has to be re-implemented for KSP. The official page lists the support status of commonly used libraries. The most famous libraries on the unsupported list as of today are probably Dagger and Hilt, the widely used dependency injection frameworks for Android. So if your project uses one of these two, you are out of luck and may stop here.

Luckily for us, we didn’t have any major roadblocks to migrating to KSP. The frameworks/libraries that use annotation processors in our project are:

  • Moshi codegen
  • Android data binding

Moshi codegen supports KSP, so there’s no problem there. Data binding, on the other hand, doesn’t support KSP, and we don’t think it ever will.

Android data binding: demon or blessing?

When data binding came out, it was such a breath of fresh air: Android land had long suffered from the lack of a solution for binding UI. With data binding, everything binds automatically and lifecycles are handled smoothly. However, we quickly discovered the drawbacks of this framework.

The ability to write full Java code in the XML is bad

As Android devs, we all learn about architectures like MVP and MVVM: one of the core ideas is to keep business/UI logic out of the View layer. Take MVVM as an example: your UI logic lives in the ViewModel (when to show/hide a text, etc.). Data binding is an excellent tool for reducing boilerplate here: you tell the View to bind to a property on the ViewModel, and data binding updates the view whenever the property value changes. It also works in reverse: for any event from the View, you can instruct data binding to notify the ViewModel.

So far so good, until you learn that you can write binding expressions in full Java syntax, and you start seeing code like this in the codebase:

<TextView
   android:id="@+id/shown_price"
   style="@style/BigPrice"
   android:layout_width="wrap_content"
   android:layout_height="wrap_content"
   android:text="@{state.shownPrice}"
   tools:text="$420.00"
   android:visibility="@{state.shownPrice != null}"
   app:layout_constraintTop_toBottomOf="@id/kf_logo"
   app:layout_constraintStart_toStartOf="parent" />

It looks so innocent on the surface. But the simple line of

android:visibility="@{state.shownPrice != null}"

leaks UI logic, or even business logic, into the view layer: the view now decides when to show itself. What if we wanted to show the price only when it is 0, not merely non-null? That is surely a business decision and it belongs in the ViewModel. The better version is to create another property in the state called shownPriceVisibility, bind it directly here, and have the ViewModel control that property.

One may argue that it’s not data binding’s fault for providing the ability to write full expressions; at the end of the day, it’s just a tool, and its users should take responsibility. That’s true to some extent. However, because data binding makes manipulating the View so easy, you’re more likely to write seemingly small but harmful things like this. One thing we’ve learned from the past is that, in the long run, a tool that makes itself hard to misuse is miles better than a tool that does everything easily, including shooting yourself in the foot.

Spotty IDE support

Have you ever had Android Studio tell you that all the code referencing the XML binding is wrong, yet the project compiles without a problem? Then come the rebuilds, the IDE restarts, the cache invalidations and the dreadful waits, just to find out it still doesn’t work, and then all of a sudden it works again. You are not alone. The teams at Google and JetBrains have worked incredibly hard on this, and you seldom see it today. It’s no one’s fault that this doesn’t work seamlessly, because the code generation involved is so complicated: essentially, you are referencing code that hasn’t been generated yet.

With data binding’s support for customized bindings, it’s also difficult to reason about the code: you need to jump between Kotlin and XML files to find where a particular binding method is defined.

Lastly, it uses annotation processing

Needless to say, with Kotlin as our primary programming language, the use of any annotation processing library is going to slow us down.

Remove data binding

We’ve explained why we didn’t like data binding; in the end, our team decided to go ahead and remove it from our codebase. However, there’s one part of data binding we wanted to keep: view binding.

View binding to the rescue

If you haven’t used it before: view binding generates a Java holder class for each XML layout, exposing the views as properties so you can access them directly from code. View binding used to be part of data binding, but engineers at Google have extracted it into a separate Gradle task. If you only enable view binding, no annotation processing is involved: it merely transforms XML to Java, so it doesn’t need to run within the Java compilation process. A dedicated Gradle task also lets Gradle optimize the build process: in theory, Gradle can look at the inputs and outputs and figure out what has changed and what can be cached. This is not easily doable with annotation processing.

Anyway, with the help of view binding, we can transform the existing data binding XML from:

<TextView
            android:id="@+id/product_name_tv"
            style="?textAppearanceTitleSmall"
            android:layout_width="0dp"
            android:layout_height="wrap_content"
            android:layout_marginEnd="16dp"
            android:text="@{viewModel.title}"
            app:layout_constraintEnd_toStartOf="@+id/chevron"
            app:layout_constraintStart_toEndOf="@id/iv_item_image"
            app:layout_constraintTop_toTopOf="@id/iv_item_image" />

To:

XML:

<TextView
            android:id="@+id/product_name_tv"
            style="?textAppearanceTitleSmall"
            android:layout_width="0dp"
            android:layout_height="wrap_content"
            android:layout_marginEnd="16dp"
            app:layout_constraintEnd_toStartOf="@+id/chevron"
            app:layout_constraintStart_toEndOf="@id/iv_item_image"
            app:layout_constraintTop_toTopOf="@id/iv_item_image" />

Kotlin:

val binding = ProductItemViewBinding.inflate(inflater, parent, false)
viewModel.title.observe(lifecycleOwner) { title ->
    binding.productNameTv.text = title
}

It’s a bit more code to write, but it’s easier to see what is going on than with the magical binding expression.

Byproduct: replacing the deprecated Kotlin Android Extensions

While replacing data binding with view binding, we also took the chance to remove the Kotlin synthetics from the Kotlin Android Extensions plugin. This plugin adds synthetic extension properties to Activities, Fragments and Views to help you eliminate calls to findViewById. It’s a plausible concept at first glance, until you realize that what it does under the hood is a dirty hack at best with regard to Android’s lifecycle; there’s little surprise it was deprecated over the years. Unfortunately, our project made heavy use of it. With the move to view binding, we managed to remove all of them.

You can refer to the official documentation to find out how to do it: https://developer.android.com/topic/libraries/view-binding/migration.

Result

After removing data binding, we were able to migrate our code to KSP. With kapt gone, Kotlin Android Extensions gone and KSP enabled, we did another measurement:

Conclusion

We can see that the build time has been reduced reliably. It stabilized at around 100s (1 minute 40 seconds), much better than the baseline of around 3-4 minutes.

Finally…

As shown in the graph and our experiment results, these measures contributed the most to the outcome:

  • Disabling the Newrelic plugin
  • Updating to the latest JDK
  • Replacing kapt with KSP

With these efforts, we have reduced the full build time from 4-6 minutes to 1-2 minutes! One thing missing from this article is a measurement of the improvement to incremental build times; that’s very difficult to quantify, as there is a lot of variance across incremental build scenarios. However, since the overall performance has improved, we expect a similar ratio of improvement for incremental builds.

That’s all for this blog post. We’d be happy to hear from you about ways of getting your build time down too.

Hack Day - Image Search

What if we could just upload an image of what we want to buy into our favourite shopping website's search bar? Wouldn't that make life about 100x easier?

These are the questions that my team and I attempted to answer during our last hack day. We are always looking into ways of making the online shopping experience more efficient for our users, and what better way than to make use of artificial intelligence? With new technologies being introduced into the image recognition space every day, we thought we should start laying some groundwork for something that could be really useful to us in the near future.

Research

We started out by exploring options that are quick and out of the box, rather than building our own model. One idea that came up during our research was to use an available pre-trained model. What is a pre-trained model, you ask? Pre-trained models are ML models that have been trained on millions of data points and are readily available to use. For our purposes, we ended up with VGG-16, a model pre-trained on ImageNet. VGG-16 is a convolutional neural network that is 16 layers deep; the pre-trained version, trained on more than a million images from the ImageNet database, can classify images into 1000 object categories, such as keyboard, mouse, pencil, and many animals. This model should get us started on a basic image labeling engine. What we needed to add on top was accepting an uploaded image and processing it on the fly.

Implementation

Our work was divided into two main parts:

  1. An image labeling app, whose response is the label of an image that the user uploads.
  2. Modifying the existing Kogan.com website to accept an image as a search input, with all the integration with the image labeling server.

We divided our team into two smaller groups; one team worked on the former and the other on the latter.

For the image labeling application, we chose Flask, given its quick setup. We set up a Flask app with a single API endpoint that accepts an uploaded image and responds with the list of labels for it. The backend engine uses the VGG-16 model to do the labeling.

The first part is the usual image upload handling in Flask, which is pretty common. Once we’d saved the image, we passed the image path to a VGGModel object to do the prediction:

from tensorflow.keras.applications import vgg16
from tensorflow.keras.preprocessing.image import load_img
from tensorflow.keras.preprocessing.image import img_to_array
from tensorflow.keras.applications.imagenet_utils import decode_predictions
import numpy as np


class VGGModel:
    def __init__(self):
        # Load VGG-16 with weights pre-trained on ImageNet.
        self.model = vgg16.VGG16(weights='imagenet')

    def get_predictions(self, image_path):
        prepro_image = self.prepro_image(image_path)
        predictions = self.predict(prepro_image)

        labels = [label.replace('_', '+') for _, label, _ in predictions[0]]
        return labels

    def prepro_image(self, image):
        # Resize to the 224x224 input VGG-16 expects, then normalise it.
        original = load_img(image, target_size=(224, 224))
        numpy_image = img_to_array(original)
        image_batch = np.expand_dims(numpy_image, axis=0)
        processed_image = vgg16.preprocess_input(image_batch.copy())
        return processed_image

    def predict(self, processed_image):
        predictions = self.model.predict(processed_image)
        label_vgg = decode_predictions(predictions)
        return label_vgg

The predict function returns, for each image, a list of labels sorted from the most confident to the least. Once we’d finished this basic application, we extended our work by dockerising the whole server so it could be handed over to the other team for integration.
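Putting it together, the upload endpoint wrapping this class might look something like the sketch below (the route name, upload directory and response shape are assumptions rather than the exact hack day code):

import os

from flask import Flask, jsonify, request
from werkzeug.utils import secure_filename

app = Flask(__name__)
model = VGGModel()  # the class defined above

UPLOAD_DIR = "/tmp/uploads"
os.makedirs(UPLOAD_DIR, exist_ok=True)

@app.route("/label", methods=["POST"])
def label_image():
    # Save the uploaded file, then run it through the model.
    file = request.files["image"]
    path = os.path.join(UPLOAD_DIR, secure_filename(file.filename))
    file.save(path)
    return jsonify({"labels": model.get_predictions(path)})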

The next part was to modify our site to accept an image upload, send the image to the image labeling server, and process the list of labels. We didn’t do anything special here: as a mock-up of the whole thing, we had a single unstyled button that reads a file and uploads it on click, implemented as a simple form that automatically submits when a file is selected. While in a real implementation the image would first be uploaded to our main site backend, for the sake of simplicity we uploaded it directly to the image labeling server to get the result.

The results we got were quite good, given that this is an open-source model pre-trained on generic images. It can correctly label images of shoes, shopping carts, and buckets. On the other hand, it gives weird results for more complex images, such as fruits, people, and other more specific subjects.

What we learned and how we could improve

Given the time constraints and the limits of our Machine Learning expertise, we think this is a good start for an image search feature on an ecommerce website. There are many considerations before rolling this out into production, such as priority, brand detection and more precise image labeling. Branding is an important one: for example, the ability to detect “Nike shoes” instead of generic labels like “shoes” or “running shoes” could hugely benefit the customer using it. There is also room to fine-tune the model to meet these needs.

Another option is to use a managed cloud service, such as Google’s, to do this; however, it would produce similar results since we haven’t fine-tuned anything yet, so we passed on this option, as it would take some time to get it working and to sort out the GCP account. Nevertheless, this shows that a basic working image search doesn’t require the huge time and effort investment one might expect.

Final words

As with the previous ones, this hack day allowed us to explore topics that could be useful to the further development of the website. It’s always interesting to discover new technologies and adapt them to our current codebase, as well as to collaborate with different people within the Engineering team to make our ideas come to life. It was also really nice to see what the other teams came up with, and the presentations at the end of the day showed how creative and skilled our engineers really are!

Django Admin and Deleting Related Objects

What's wrong with the UX in this image?


There are two X buttons which do completely different things. Reading left to right, the first one sets the warehouse to null, and the second one deletes “Australia” from the database without a warning!

We set out to fix this, as we decided we never want to either:

  • Delete a record without a warning
  • Delete a related object from another admin screen

Removing this element on individual widgets can be achieved by setting can_delete_related=False (a sketch of one way to do that is below), but how can it be done across the board?
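For a single admin, a sketch might look like this (WarehouseAdmin is hypothetical; the admin wraps related-field widgets in RelatedFieldWidgetWrapper, which exposes this flag):

from django.contrib import admin

class WarehouseAdmin(admin.ModelAdmin):
    def formfield_for_dbfield(self, db_field, request, **kwargs):
        formfield = super().formfield_for_dbfield(db_field, request, **kwargs)
        widget = getattr(formfield, "widget", None)
        # Only related-field widgets wrapped by the admin have this attribute.
        if widget is not None and hasattr(widget, "can_delete_related"):
            widget.can_delete_related = False  # hide the delete X
        return formfield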

Django templates are very extensible and customisable. We can force can_delete_related=False at render time, across the board, by extending the related_widget_wrapper template:

{% extends "admin/widgets/related_widget_wrapper.html" %}
{% block links %}
   {% with can_delete_related=False %}
       {{ block.super }}
   {% endwith %}
{% endblock %}

Now the second X is gone and our users won't be confused!

Widgeting around Kogan.com Hack Day!

There’s never any rivalry between Mobile and Web developers, right? Well, the intrigue of what happens in the (more exciting) mobile development space piqued the interest of our frontend and backend mates, so a few joined the Kogan Apps team, along with our talented UX designer, to tackle a project as part of a recent Hack Day.

An idea was pitched to develop a Kogan widget for both iOS and Android devices. On any given day, we have multiple deals going on (see below for a screenshot from our website), but we narrowed it down to showcasing our popular ‘Kogan Daily Deal’, complete with its own countdown timer: firstly, to generate excitement and the sense that time is running out, and secondly (and importantly in the world of eComm), to induce FOMO and prompt our users to purchase!

Divide and conquer! (How we started)

Consistent with Agile methodology, we created a backlog of tasks and divided them amongst our teammates. The web developers tackled the API, the mobile developers started their widgeting, and our UI/UX designer designed away.

We did manage to squeeze in a quick 101 for the web devs about the wonderful worlds of SwiftUI and Jetpack Compose. WD1 (web developer one) tried to relate it to how Angular, Vue or React work on the frontend, which is understandable, and WD2 (web developer two) was amused by the discussion. It did come up later in conversation that WD2 had been a Mobile developer in the past and had worked with Objective-C. Despite showing WD1 and WD2 the cooler things that can be achieved with SwiftUI and Jetpack Compose, they remained unconverted - but hey, that’s their loss!

Henceforth, we will relay our learnings and achievements as conquerings!

Conquering #1

A Mobile developer (MD) who creates UI and writes code in Swift/Objective-C or Kotlin/Java understands the pains of working with an imperative programming style. A WD or frontend developer is blissfully unaware of these issues: they work with FlexBox and CSS, whereas a Mobile or Desktop developer has to hand-build the UI to be as pixel-perfect as the designer conceptualised. A declarative paradigm like SwiftUI or Jetpack Compose makes it as easy for our MDs to create UI as it is for our teammates using FlexBox and CSS.

When the designs were completed, the next task was to compose the UI declaratively. While we waited for WD1 and WD2 to finish the API that would retrieve the Daily Deal data, static images were used as placeholders.

Conquering #2

There was no API in the Kogan-verse to consume the Daily Deal data, so a handy bit of reverse engineering helped WD1 locate the data source, and they were able to create a wrapper providing the necessary API endpoint that returns the Daily Deal structure.

Matching the UI designs accurately in the widget using SwiftUI was difficult! While SwiftUI allows for composing new components quickly, it also has its fair share of quirks, and the layout gave us some grief. We did end up getting the images to load, but getting the timer to tick away per the three-boxed component design proved too challenging to hack in the one day.

Conquered

We completed the widget family for various sizes, per the screenshots below. SwiftUI allows for previewing multiple views simultaneously.

Conquering #3

With SwiftUI, or any framework, if we embed data into a scrollView we get an instant list of multiple items that can be scrolled and viewed. We spent time trying to make scrollView work, only to realize that it crashes the app: SwiftUI does not provide a scrollView to a widget. So the list of Hot Deals had to be reduced from a scrollable list to a fixed list of 3 items.

Conquering #4

Writing a lot of formatting functions and styles only highlighted that the designs could be improved. While we were working with a declarative way of creating components, using fixed sizes ranging from 11 to 22 points was not the best idea. In hindsight, we could have followed the Apple methodology of declaring styles, just like in CSS, and applied a Heading, Body or similar type tag to an element rather than adhering to strict font sizes.

Conquering #5

In the designs there were places where the price had no decimals, and places where the decimal digits used a smaller font. This got complicated, since the textbox hosting “$479.75” needed to be split into three components, “$”, “479” and “.75”, each formatted accordingly. So, similar to the TimerView above, and to keep it customisable, we put in a flag to conditionally display the decimal places, handling both the “$479.75” and “$599” style designs. Passing font sizes around and extracting the decimal part of a number is quite painful when computers store numbers in 4 bytes: 0.99 becomes 0.9900000000000000001, and 1.00 becomes 1 with no trailing zeros. Our JavaScript friends could use strings and take the characters around the decimal point. We instead used Swift’s NumberFormatter with a minimum and maximum of 2 fraction digits to ensure consistency, and used the suffix command to get the last 2 digits.

Finish

Our conquering process to develop a widget was an interesting one. Experimenting with the quirks of the language and technology highlighted learnings that are valuable and applicable to our day-to-day development work as Mobile engineers. And working harmoniously alongside our WD friends to achieve our goals, while showcasing to them the wonderful world of app development, is always a bonus. Onwards and upwards as we look forward to conquering the next hack day!