Performance decreasing as complexity/size increases

It could be a mix of things. There's your connection sending the request to the server (when loading the page), the server finding the content, your connection receiving the data, and your CPU and memory storing it, laying it out, and calculating the HTML for all the elements. The last few of these scale with the number of fields and entities you load.

So far we have 6 655 entities, with relations and a few formulas here and there, some of which are complex.

I've tested against another Workspace I have, where the total number of entities across all spaces is less than 100.
I only did a quick test of creating entities in a table view, but what I found was this: the more fields you have showing, the longer it takes. If you hide them, everything becomes snappier.

Furthermore, I also tested a filter showing entities created today (which was 0) against no filter, where some 1 600 entities are shown, with only one field visible. I didn't notice it being slower, but this may vary depending on the view used.

Formulas, lookups, and relationships might take a bit to load, and there can be various factors here too: some formulas require iterating over all entities or an index, so they can scale more or less with the entity count. Thankfully, in my case I am not dependent on these being snappy and readily available.

Honestly, I think the biggest reason for sluggishness in most cases is the HTML render cycle and client-side JavaScript, and you can likely remedy this by hiding fields.

That said, we have no idea how long it takes for you or how sluggish it is; for all we know, it might be taking an abnormal amount of time in your case :thinking:

1 Like

Let me provide some info/hints here.

  1. Fibery can handle large data volumes; we have accounts with 300K records. It works relatively fast for up to 10K records in a single table and "bearably" for 30K+ records in a table. Larger datasets will take time to load, so the initial table load may be slow.
  2. It all depends on data complexity. If you have many relations and want to show them all in a table view, it becomes slower.
  3. Formulas and Lookups do not affect performance, since they are calculated asynchronously and the results are stored in the database, so View performance is not affected.
  4. Most of the slowness is UI. We recommend using Filters to show less data and not adding unnecessary fields to Views. What kind of workstation do you have? Fibery is quite CPU-demanding for some Views.
  5. Reports work OK for tables up to 30K records; beyond that, you may wait 2-5 minutes for the data to load.

It would be very helpful to see a video of your usage patterns and what you perceive as "slow", since it might be normal or not. If it's not, we would like to dig into the details and find out what is going on. If the data is private, you may contact us via Intercom and share the video privately.

3 Likes

Thanks! I'd be happy to put together a video if no obvious solution appears! The rendering and JS execution sounds like a likely culprit: the slowdown definitely correlates with an increasing number of fields, particularly lookups. You mentioned hiding fields, and I'd be delighted to do that: most of the fields I've added are only required in limited views (as a 'field' here, for a filter there, etc.), and yet the 'node' views have become quite cumbersome (and take the longest to load). But I haven't figured out how to hide fields; would you mind pointing me to the relevant docs? (I do know how to hide columns in the Board view, although I don't know how to unhide them again!)

1 Like

Hmm, by default when you create a View you see just the Name. Can you post a screenshot of a view where you see all the fields? Maybe you are mainly using the Database view, the default in the Space?

1 Like

The views that are the slowest are the ones I get when I click "Open" to the left of a row in a Database. I would love to be able to tune those views to hide (most of) the fields, and I imagine that would likely fix my performance issue. I think I need to use Intercom to send you the screenshot: how do I initiate that? (Or I can email, Slack, or Discord it to you.)

Just click on the Chat button and paste a screenshot :slight_smile:

This is embarrassing, but I don't see the chat button (and I have all my ad/beacon blockers disabled). Is it only present on certain pages? Update: found it. Sent the screenshot.

I got it; this is exactly what we are working on right now :slight_smile: We call it Entity View. I hope that in a month or two it will be possible to show/hide fields there.

2 Likes

Sounds great! So presumably at that point I'll be able to configure my entity views to show only the fields I care about, and that will make them render much more quickly?

2 Likes

Yes, but we will improve this View's performance as well.

1 Like

I ran into performance issues when leveraging the Zendesk integration, which may be relevant to the discussion here.

The issue I ran into was that it is difficult to apply filtering before the browser tries to load all the entities. At one point the browser was crashing before I could apply enough filtering to stop it from crashing. It does appear to be much better now that only the name field is included by default, but I also don't have nearly as many items in the database now.

Ultimately, it seems paging of some kind must be added at some point. There are just too many HTML elements on the page, which is what drives up the CPU usage. That can happen with very wide, complicated tables and/or many rows, and it gets worse the more complicated the columns are. This is also why apps that must render large tables (e.g., Google Sheets) draw them with canvas rather than HTML. However, Fibery has many view types, so I imagine leveraging canvas for one view is not a big priority.

Even if all the data is still delivered to the client, not creating HTML elements for every item would improve the situation quite a bit. As the number of HTML elements on a page increases, everything slows down.
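To make the idea concrete, here is a minimal windowing/virtualization sketch in TypeScript. This is not Fibery's code, just the general technique: only the rows that fit in the viewport (plus a small buffer) get DOM nodes, and a single tall spacer stands in for the rest.

```typescript
// Minimal list virtualization sketch (not Fibery's implementation).
// Only rows visible in the scroll container get real DOM nodes.
const ROW_HEIGHT = 32;   // assumed fixed row height in px
const BUFFER_ROWS = 10;  // extra rows rendered above/below the viewport

function renderVisibleRows(
  container: HTMLElement, // scrollable element
  content: HTMLElement,   // relatively-positioned spacer that hosts the rows
  rows: string[],         // full dataset stays in memory; only the DOM is limited
): void {
  const first = Math.max(0, Math.floor(container.scrollTop / ROW_HEIGHT) - BUFFER_ROWS);
  const visibleCount = Math.ceil(container.clientHeight / ROW_HEIGHT) + 2 * BUFFER_ROWS;
  const last = Math.min(rows.length, first + visibleCount);

  content.replaceChildren(); // drop previously rendered rows
  for (let i = first; i < last; i++) {
    const div = document.createElement('div');
    div.style.position = 'absolute';
    div.style.top = `${i * ROW_HEIGHT}px`;
    div.style.height = `${ROW_HEIGHT}px`;
    div.textContent = rows[i];
    content.appendChild(div);
  }
}

function mountVirtualList(container: HTMLElement, rows: string[]): void {
  const spacer = document.createElement('div');
  spacer.style.position = 'relative';
  spacer.style.height = `${rows.length * ROW_HEIGHT}px`; // keeps the scrollbar honest
  container.replaceChildren(spacer);
  container.addEventListener('scroll', () => renderVisibleRows(container, spacer, rows));
  renderVisibleRows(container, spacer, rows);
}
```

Even with 60k rows in memory, the page only ever holds a few dozen row elements, which is what keeps layout and paint cheap.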

2 Likes

We addressed this issue already here → Limit max amount of entities in Board/Calendar/List Views

Great, yeah maybe that explains why I’m not seeing the same performance issues anymore.

Great discussion. Throwing in some findings from my end. Fibery is heavy on JavaScript, which requires lots of CPU. Screen sharing within Slack etc. is also heavy on the CPU. My MacBook Pro from 2017 struggles a lot with this setup. We are turning to Google Meet, which is easier on the CPU when screen sharing. This frees up enough CPU to run Fibery pretty well. I've heard the new Apple M1 processors are brilliant at JavaScript, so perhaps that's an upgrade to aim for. In the meantime, I am confident that Fibery will optimize their code as much as they can.

2 Likes

@mdubakov: I have a test case I can share where I ran into multiple performance issues.

We publish content and have our own simplified category tree. We often post deals from Amazon, and I wanted to map the Amazon category tree to our own. In something like Airtable, it is difficult to represent this hierarchy, so I decided to try the hierarchical list in Fibery, which is especially well suited to this.

I cleaned up the category data, which is a flat list of categories (not nested) with about 60k rows. Each category contains a list of its children (not super useful in this case), but there is also a field with the parent category's id.

The issues I ran into with regard to performance:

CSV import

  • The CSV importer really struggles with this number of rows and makes Fibery unresponsive across all my tabs. You don't know whether it has hung or is still making progress.
  • I split the CSV into multiple chunks of 5k rows and was able to get that working (a rough splitting sketch follows this list).
  • At some point, one of the CSV imports stopped halfway through, so I wasn't sure how to keep importing without ending up with duplicates. There is no way to sort by the public id in Fibery, so it was hard to tell which item was actually the last one imported from the CSV and where to start again.
  • While importing via CSV or via the API with a tab open, there seems to be no initial limit on the number of rows shown in the views until you reload the tab. So, having a view of the imported data open during the import seems to cause the browser to hang.
  • I got partway through this with about 20k categories loaded, and the hierarchical list was actually working great once all the relations were processed.
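For anyone wanting to do the same split, here is one way to do it: a small Node/TypeScript sketch that cuts a CSV into 5k-row chunks and repeats the header in each chunk. The file name and chunk size are just examples, and it assumes no quoted fields contain embedded newlines.

```typescript
// Rough sketch: split a large CSV into 5k-row chunks for separate imports.
// Assumes a simple CSV (no embedded newlines inside quoted fields).
import { readFileSync, writeFileSync } from 'node:fs';

const CHUNK_SIZE = 5_000; // rows per chunk
const lines = readFileSync('categories.csv', 'utf8').split(/\r?\n/).filter(Boolean);
const [header, ...rows] = lines;

for (let i = 0; i < rows.length; i += CHUNK_SIZE) {
  const chunk = rows.slice(i, i + CHUNK_SIZE);
  const fileName = `categories-part-${i / CHUNK_SIZE + 1}.csv`;
  writeFileSync(fileName, [header, ...chunk].join('\n') + '\n');
  console.log(`${fileName}: ${chunk.length} rows`);
}
```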

deleting

  • So, since I was in an unknown state, I decided to scrap the CSV import and instead delete all the rows and start over via the API.
  • I couldn’t find a way to efficiently delete all the rows without deleting the entire database
  • I could select all in a table view, but then the page would hang
  • I had to create filters for arbitrary chunks of ~1k items to get the delete to work without hanging my browser (a rough chunked-delete sketch follows this list).
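For completeness, this is the kind of chunked deletion I would have liked to script. It is deliberately generic: deleteBatch is a placeholder for whatever API call actually removes entities (check the Fibery API docs for the exact command); the point is just to keep each request small enough that nothing hangs.

```typescript
// Generic chunked-delete sketch. `deleteBatch` is a placeholder for the real
// API call that deletes a batch of entities by id.
type DeleteBatchFn = (ids: string[]) => Promise<void>;

async function deleteInChunks(
  ids: string[],
  deleteBatch: DeleteBatchFn,
  chunkSize = 1_000, // ~1k per request kept things responsive for me
): Promise<void> {
  for (let i = 0; i < ids.length; i += chunkSize) {
    const chunk = ids.slice(i, i + chunkSize);
    await deleteBatch(chunk);
    console.log(`Deleted ${Math.min(i + chunkSize, ids.length)} / ${ids.length}`);
  }
}
```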

auto-relations

  • The UX of auto-relations to the same table is quite confusing in this case. I thought I had it right, then ended up unsure whether I was seeing the auto-relations in the middle of reprocessing or not.


questions

I know this use case is on the extreme side, but it is not massive in terms of the amount of data: two numeric columns, a handful of basic text columns, and an auto-relation to itself. I'm curious about the best practice for loading data like this. Should I be setting the relationship via the API? If so, how would I do that? Load all the data first without relationships, then go back through for a second pass setting the relationships (see the sketch below)? (Otherwise, a category's parent often won't exist yet at the moment that category is imported.)
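To make the question concrete, this is the two-pass approach I have in mind, sketched in TypeScript. createCategory and setParent are hypothetical wrappers (I haven't confirmed the exact Fibery API commands); the idea is simply to create everything first, remember the new ids keyed by the source category id, and then wire up parents in a second pass once every row exists.

```typescript
// Two-pass import sketch: create all rows first, then set parent relations.
// `createCategory` and `setParent` are hypothetical wrappers around the real API.
interface CsvCategory {
  categoryId: string;        // id from the source data
  parentCategoryId?: string; // parent id from the source data, if any
  name: string;
}

async function importCategories(
  rows: CsvCategory[],
  createCategory: (c: CsvCategory) => Promise<string>, // returns the new entity id
  setParent: (childEntityId: string, parentEntityId: string) => Promise<void>,
): Promise<void> {
  // Pass 1: create every category and remember source id -> entity id.
  const entityIdBySourceId = new Map<string, string>();
  for (const row of rows) {
    entityIdBySourceId.set(row.categoryId, await createCategory(row));
  }

  // Pass 2: every parent now exists, so relations can be set safely.
  for (const row of rows) {
    if (!row.parentCategoryId) continue;
    const child = entityIdBySourceId.get(row.categoryId)!;
    const parent = entityIdBySourceId.get(row.parentCategoryId);
    if (parent) await setParent(child, parent);
  }
}
```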

You can download the CSV file of the Amazon category tree here.

1 Like

It’s a tiny thing to mention, but giving the database a name in the singular sometimes makes it easier to understand how a relationship is working (when one-to-many).

Thank you for reporting this. I have been running into similar problems, particularly with CSV import performance (which seems entirely unnecessary - no need to load the whole thing into memory just to set up the field mappings!), but hadn’t yet made the time to post the problem. I almost think it should be a separate post, but the important thing is that it’s documented. I’ll add details of my situation at some point if it seems relevant, but the problems are basically the same.

The process of field mapping includes a check that the data in the CSV file is compatible with the chosen type.
So if, for example, a column that is to be mapped to a number field contains text, this will be flagged.

I think that’s why the whole dataset is loaded into memory.
Of course, if the incompatible data is beyond the 100th row, the mismatch won’t be visible :confused:

It therefore seems reasonable to only load the first 100 rows, I suppose.
Having said that, I don't know that this would solve all issues with importing large files - there may still be problems with the browser freezing during the actual import.
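A rough sketch of what "only check the first 100 rows" could look like, assuming a simple CSV without embedded newlines; the column types and the number check are illustrative:

```typescript
// Sketch: validate only the first N data rows of a CSV against expected types,
// instead of loading the whole file into memory. Types shown are illustrative.
type ColumnType = 'text' | 'number';

function validateSample(
  csvText: string,
  columnTypes: ColumnType[], // one entry per column, in order
  sampleRows = 100,
): string[] {
  const problems: string[] = [];
  const lines = csvText.split(/\r?\n/).filter(Boolean);
  for (let i = 1; i <= Math.min(sampleRows, lines.length - 1); i++) {
    const cells = lines[i].split(','); // naive split; real CSVs need a proper parser
    columnTypes.forEach((type, col) => {
      const value = (cells[col] ?? '').trim();
      if (type === 'number' && value !== '' && Number.isNaN(Number(value))) {
        problems.push(`Row ${i}, column ${col + 1}: "${value}" is not a number`);
      }
    });
  }
  return problems;
}
```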

1 Like

In my very humble non-developer opinion, this is something that 100% should be done server-side. Obviously part of it ultimately is. What I mean is: I would select a CSV file and click Import; it sends just the file to the server as efficiently as possible; the server munches on it and sends back to my UI only the info pertinent to me (as you note, field mapping issues, etc.); I select options from a very smooth, responsive UI; and when I click Finalize Import (or whatever), it sends my choices back to the server, which has my CSV ready to parse at its leisure, as server resources allow. But critically, none of this should have any real effect on my browser for larger data sets. If Fibery wants to replace a multitude of other tools, it will need more robust data ingest that doesn't rely so much on the user's local hardware and browser resources.

I know that, to date, Fibery has tried to do a lot of work on the client side. It seems to have mostly worked. But we're starting to see the performance issues that can result from that.

4 Likes

Yeah, this was actually my first time trying the CSV importer out. Overall, it is super easy to use and provides helpful feedback about the import, but once you click "import" on large files you enter an unknown state where you don't know what is going on. The client-side aspect also means you don't know the state of the import if your browser crashes in the middle of it.

Even so, I don't see why the client-side import has such an issue with this file. 60k rows is not crazy. I have 32GB of RAM on a 2019 MacBook Pro, so the device is very capable. 13MB of CSV data should be possible to iterate through without freezing the client (see the sketch below). I do agree, though, that ideally this would be handed off to some kind of background task whose progress you can monitor while still using Fibery in other areas.
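To illustrate: even fully client-side, a 13MB file can be walked through without locking the tab if the work yields back to the event loop between batches. A toy sketch (the batch size and row handler are made up):

```typescript
// Sketch: process CSV rows in small batches, yielding to the event loop between
// batches so the tab stays responsive. Everything here is illustrative.
async function processRowsWithoutFreezing(
  rows: string[],
  handleRow: (row: string) => void, // placeholder for whatever the import does per row
  batchSize = 500,
): Promise<void> {
  for (let i = 0; i < rows.length; i += batchSize) {
    for (const row of rows.slice(i, i + batchSize)) {
      handleRow(row);
    }
    // Give the browser a chance to paint and handle input before continuing.
    await new Promise((resolve) => setTimeout(resolve, 0));
  }
}
```

Pushing this into a Web Worker would be better still, but even simple yielding avoids a frozen UI.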

Yeah, I try to give singular names in general, but forgot here. I did try this approach with our own category tree, which I was trying to map to the Amazon category tree. So this is a similar import, but we only have ~450 categories, with the same approach of a reference to the parent on each row. I named the database DN Cat, a shortened version of "<our company name> category". With the full-length word "category" in there, the text truncates for me, so you can't see the ends of the words.

I think that even with the improved naming, there is room for confusion. This is only tangentially related to performance, but as the number of items gets large, the uncertain aspects of the UX tend to make things worse.

Where things are still confusing is the auto-relationship portion. Because the second line of the relationship setup has the same naming, "DN Cat to DN Cat (arrow pointing up, meaning the parent cat)," I assumed it referred to the same order as defined above (red arrows below): DN Cat (child) to DN Cat (parent), which would mean the relationship is defined as DN Cat (child).parentcategoryid=DN Cat (parent).categoryid.

However, the configuration pictured above is not the correct order. I had to reverse the auto-relationship, with categoryid on the left side and parentcategoryid on the right side, to get the relationship set up correctly.


Anyway, I think some of these issues could be split off into separate topics, but I wanted to give a fairly complete account of the friction points I came across while importing this particular data set, which should make a good hierarchical test case.

2 Likes