I thought I might save others the headache I endured over the last several days while moving my Plex installation to AWS, running in a container and backed by Amazon Cloud Drive (unlimited).
This should apply to anyone who keeps their Plex .db files (/config or /var/lib/plex/Library) on some type of "less than local SSD" storage. In my case I'm using AWS EBS volumes, which are reasonably fast but still slow enough that my Plex server hated them (even when I tried provisioned IOPS with EBS-optimized instances).
After moving my install to the cloud, the Plex server was functional, but it would stall, time out, freeze, fail to play videos most of the time, etc. -- just generally unusable, and definitely a far cry from what I was used to.
New Server Details:
General: Cloud Hosted Container Running on Kubernetes
Provider: AWS
Region: East 1 Region (same region as Amazon Cloud Drive to avoid pesky transfer fees)
Instance Type: r4.xlarge (using spot instance at $0.03 max bid per hour -- will probably move to r4.large now that I've solved my problem)
Docker Image: tcf909/plex (based on the official Plex image, but with some tweaks like using a DNS name in my announce environment variable, etc.)
Container Environment: Kubernetes 1.4.8 (set up using kops 1.5 beta -- super easy)
Storage:
/config: 120GB EBS GP2
/transcode: tmpfs (memory backed)
/data: rclone mounted Crypt instance backed by Amazon Cloud Drive (unlimited encrypted storage)
The prospect of the new environment was really cool. In practice, after some work, everything was "functional" but nothing was really "usable". Videos would try to play and then time out, and responses were extremely delayed when browsing the library -- it was just horrible. I spent time checking every single component: storage speed (for config, transcode and data), rclone performance, Amazon Cloud Drive, encryption performance, network speed, CPU speed, etc. Everything individually seemed to check out.
After spending days troubleshooting resources, I finally compared the Plex logs to those from my working server and noticed that the messages complaining about a busy database (which I had assumed were normal when scanning a large library) were not in fact something I saw on my local server under the same conditions.
After googling a bit, I realized that the SQLite database was not responding as quickly as "something" (Plex Media Scanner, Plex Media Server, etc.) thought it needed to, and that seemed to be the bottleneck for EVERYTHING. The problem is that EVERYTHING, including starting a video, transcoding, and updating your running time as you're playing the video -- I mean every single aspect of interacting with Plex -- seems to rely on the SQLite database. When it was locked, not unlocked in a timely fashion, or not performing well, nothing worked the way it should.
I tested this theory by putting the database files themselves (Plug-in Support/Databases/*) on a ramdisk, and the problems I was having instantly went away.
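For anyone who wants to compare two storage locations before moving anything, here is a minimal sketch of a latency probe. It is only an illustration and not something from my original troubleshooting: the function name time_small_commits and the probe file name are made up, and it measures how quickly the storage handles small committed SQLite writes, not Plex itself.

```python
import os
import sqlite3
import tempfile
import time


def time_small_commits(directory, commits=200):
    """Return the average seconds per committed one-row insert in `directory`."""
    # Hypothetical probe file; it is created and removed by this function.
    path = os.path.join(directory, "latency_probe.db")
    conn = sqlite3.connect(path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS probe (id INTEGER PRIMARY KEY, note TEXT)"
        )
        conn.commit()
        start = time.perf_counter()
        for i in range(commits):
            conn.execute("INSERT INTO probe (note) VALUES (?)", (f"row {i}",))
            # With SQLite's default settings, each commit waits on the storage.
            conn.commit()
        elapsed = time.perf_counter() - start
    finally:
        conn.close()
        if os.path.exists(path):
            os.remove(path)
    return elapsed / commits


if __name__ == "__main__":
    # Swap in the directories you want to compare, e.g. a ramdisk and /config.
    directory = tempfile.gettempdir()
    avg = time_small_commits(directory)
    print(f"{directory}: {avg * 1000:.2f} ms per commit")
```

Running it once against a RAM-backed directory and once against the volume that holds the Plex databases should give a rough idea of how far apart the two are for this kind of workload.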
Ultimately, keeping the .db files on a ramdisk wasn't a solution, as it would not provide any persistence. Instead, after some research, I found a PRAGMA cache setting for SQLite databases that solved my problem.
The following is a rough set of steps I went through to change the PRAGMA default_cache_size in my environment. I'm using containers, but the steps are easily translated to any environment.
1) Stop the Plex service (docker exec -it container /bin/bash, then inside the container: s6-svc -d /var/run/s6/services/plex)
2) Double-check that all related Plex processes are stopped (ps auxwwwwf)
3) Kill any remaining Plex processes (kill $PID)
4) Install the sqlite tools (apt update && apt install sqlite3)
5) Modify the pragma setting:
- (sqlite3 "/config/Library/Application Support/Plex Media Server/Plug-in Support/Databases/com.plexapp.plugins.library.db")
- (PRAGMA default_cache_size = 20000;)
- (ctrl-d)
6) Start Plex back up (s6-svc -u /var/run/s6/services/plex)
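If you would rather script step 5 than type into the sqlite3 shell, the same change can be sketched with Python's built-in sqlite3 module. This is an illustration under assumptions, not what I actually ran: the helper name set_default_cache_size is made up, it assumes your SQLite build still accepts the default_cache_size pragma, and Plex must be fully stopped first, exactly as in the steps above.

```python
import os
import sqlite3
import tempfile


def set_default_cache_size(db_path, pages):
    """Store a default page-cache size in the database file and read it back."""
    conn = sqlite3.connect(db_path)
    try:
        # PRAGMA values cannot be bound as parameters, so force an int first.
        conn.execute(f"PRAGMA default_cache_size = {int(pages)}")
        conn.commit()
    finally:
        conn.close()

    # Reopen the file to see what a fresh connection now starts with.
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("PRAGMA cache_size").fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    # Demo against a throwaway database; point db_path at the real
    # com.plexapp.plugins.library.db only after stopping Plex.
    db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE demo (x)")
    conn.commit()
    conn.close()
    print("cache_size on reopen:", set_default_cache_size(db_path, 20000))
```

The read-back on a second connection is the useful part: it confirms the value was written into the file rather than applied only to the session that set it.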
Now, unfortunately, I read that default_cache_size is a deprecated option. The proper way to fix this would be to set cache_size on each connection when it is opened (within the actual Plex code). The default value is 2000 (feel free to google this if you're interested in knowing what this translates to RAM usage wise). I opted for 10x and most of my problems went away. I have about 6TB of library, and will probably go to 20x or 30x as I progressively add the entire 6TB.
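To put rough numbers on the RAM question: a positive cache size is counted in database pages, so the cache can grow to roughly that many pages times the database's page size. The sketch below shows that arithmetic and what a per-connection cache_size looks like. The function names are made up for illustration, the estimate ignores SQLite's small per-page overhead, and I have no idea how Plex itself opens its connections.

```python
import sqlite3


def open_with_cache(db_path, pages):
    """Open a connection and set its page cache for this session only."""
    conn = sqlite3.connect(db_path)
    # A positive value means "this many pages"; it is not stored in the file.
    conn.execute(f"PRAGMA cache_size = {int(pages)}")
    return conn


def cache_memory_bytes(conn, pages):
    """Estimate the memory a cache of `pages` pages can grow to."""
    page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    return pages * page_size


if __name__ == "__main__":
    # An in-memory database stands in for the real file in this demo.
    conn = open_with_cache(":memory:", 20000)
    for pages in (2000, 20000, 60000):
        mib = cache_memory_bytes(conn, pages) / (1024 * 1024)
        print(f"{pages} pages -> about {mib:.1f} MiB")
    conn.close()
```

The page size differs between databases, so run the estimate against your own library file rather than trusting the demo's numbers.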
If your logs look like:
==> /config/Library/Application Support/Plex Media Server/Logs/Plex Media Server.log <==
Jan 17, 2017 00:25:45.384 [0x7f93d6ffd700] WARN - Waited one whole second for a busy database.
Jan 17, 2017 00:27:17.590 [0x7f93c4ffe700] WARN - Waited one whole second for a busy database.
Jan 17, 2017 00:27:18.604 [0x7f93c4ffe700] ERROR - Failed to begin transaction (../Statistics/StatisticsManager.h:191) (tries=1): Cannot begin transaction. database is locked
Jan 17, 2017 00:27:50.697 [0x7f93c4ffe700] WARN - Waited one whole second for a busy database.
Jan 17, 2017 00:28:22.069 [0x7f93c4ffe700] WARN - Waited one whole second for a busy database.
Jan 17, 2017 00:28:23.084 [0x7f93c4ffe700] ERROR - Failed to begin transaction (../Statistics/StatisticsManager.h:191) (tries=1): Cannot begin transaction. database is locked
Jan 17, 2017 00:28:24.441 [0x7f93c4ffe700] WARN - Waited one whole second for a busy database.
Jan 17, 2017 00:28:25.454 [0x7f93c4ffe700] ERROR - Failed to begin transaction (../Statistics/StatisticsManager.h:191) (tries=2): Cannot begin transaction. database is locked
Jan 17, 2017 00:29:28.383 [0x7f93d6ffd700] WARN - Waited one whole second for a busy database.
Jan 17, 2017 00:29:29.397 [0x7f93d6ffd700] ERROR - Failed to begin transaction (../Statistics/StatisticsManager.h:191) (tries=1): Cannot begin transaction. database is locked
Jan 17, 2017 00:29:31.067 [0x7f93d6ffd700] WARN - Waited one whole second for a busy database.
Jan 17, 2017 00:29:32.081 [0x7f93d6ffd700] ERROR - Failed to begin transaction (../Statistics/StatisticsManager.h:191) (tries=2): Cannot begin transaction. database is locked
Jan 17, 2017 00:31:34.466 [0x7f93baff1700] WARN - Waited one whole second for a busy database.
Jan 17, 2017 00:31:35.480 [0x7f93baff1700] ERROR - Failed to begin transaction (../Statistics/StatisticsManager.h:191) (tries=1): Cannot begin transaction. database is locked
Jan 17, 2017 00:31:37.235 [0x7f93baff1700] WARN - Waited one whole second for a busy database.
Jan 17, 2017 00:31:38.250 [0x7f93baff1700] ERROR - Failed to begin transaction (../Statistics/StatisticsManager.h:191) (tries=2): Cannot begin transaction. database is locked
Jan 17, 2017 00:31:39.774 [0x7f93baff1700] WARN - Waited one whole second for a busy database.
Jan 17, 2017 00:31:40.789 [0x7f93baff1700] ERROR - Failed to begin transaction (../Statistics/StatisticsManager.h:191) (tries=3): Cannot begin transaction. database is locked
Jan 17, 2017 00:32:13.765 [0x7f93c4ffe700] WARN - Waited one whole second for a busy database.
Jan 17, 2017 00:34:47.468 [0x7f93baff1700] WARN - Waited one whole second for a busy database.
Jan 17, 2017 00:34:48.483 [0x7f93baff1700] ERROR - Failed to begin transaction (../Statistics/StatisticsManager.h:191) (tries=1): Cannot begin transaction. database is locked
Jan 17, 2017 00:34:50.241 [0x7f93baff1700] WARN - Waited one whole second for a busy database.
Jan 17, 2017 00:35:21.287 [0x7f93d27fd700] WARN - Waited one whole second for a busy database.
Jan 17, 2017 00:35:22.302 [0x7f93d27fd700] ERROR - Failed to begin transaction (../Statistics/StatisticsManager.h:191) (tries=1): Cannot begin transaction. database is locked
Jan 17, 2017 00:35:23.879 [0x7f93d27fd700] WARN - Waited one whole second for a busy database.
Jan 17, 2017 00:35:55.076 [0x7f93d27fd700] WARN - Waited one whole second for a busy database.
Jan 17, 2017 00:36:26.132 [0x7f93d27fd700] WARN - Waited one whole second for a busy database.
Jan 17, 2017 00:36:58.152 [0x7f93d27fd700] WARN - Waited one whole second for a busy database.
Jan 17, 2017 00:37:30.121 [0x7f93baff1700] WARN - Waited one whole second for a busy database.
Jan 17, 2017 00:37:31.136 [0x7f93baff1700] ERROR - Failed to begin transaction (../Statistics/StatisticsManager.h:191) (tries=1): Cannot begin transaction. database is locked
Jan 17, 2017 00:37:32.547 [0x7f93baff1700] WARN - Waited one whole second for a busy database.
If your logs look like the above, this will most likely help you.
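If you want a quick count instead of eyeballing the log, something like the following sketch tallies the two messages shown above. The function name and counter keys are made up for illustration; the matched strings are copied from my log, so adjust them if your Plex version words things differently.

```python
from collections import Counter

# Message fragments copied from the log excerpt above.
BUSY = "Waited one whole second for a busy database"
LOCKED = "database is locked"


def count_db_contention(lines):
    """Count busy-database warnings and locked-database errors in log lines."""
    counts = Counter()
    for line in lines:
        if BUSY in line:
            counts["busy_waits"] += 1
        if LOCKED in line:
            counts["locked_errors"] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Jan 17, 2017 00:25:45.384 [0x7f93d6ffd700] WARN - "
        "Waited one whole second for a busy database.",
        "Jan 17, 2017 00:27:18.604 [0x7f93c4ffe700] ERROR - "
        "Failed to begin transaction (../Statistics/StatisticsManager.h:191) "
        "(tries=1): Cannot begin transaction. database is locked",
    ]
    print(dict(count_db_contention(sample)))
```

To run it on a real log, pass an open file object for "Plex Media Server.log" instead of the sample list. Comparing the counts before and after the cache change gives a simple way to tell whether it helped.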
Please feel free to PM me with any questions. Hopefully this helps someone else out.